Jai Verma
@jaiverma
Hello Marcin!
Fabian Yamaguchi
@fabsx00
Just an FYI, we're working on new documentation for Ocular, and lots of it applies to Joern: https://docs.shiftleft.io/ocular/quickstart
Manas Shukla
@noodlecaake_twitter
Hello Everyone
I am looking for a clustering approach using Joern.
I would be thankful for any detailed documentation or a GitHub link on this.
Manas Shukla
@noodlecaake_twitter
Also, why are there two documentation links?
https://joern.io/docs/
https://joern.readthedocs.io/en/latest/
Jai Verma
@jaiverma
@noodlecaake_twitter, the first link is the new documentation. The second one is the documentation for the old Joern (https://github.com/octopus-platform/joern)
Manas Shukla
@noodlecaake_twitter
so does the new one support clustering ?
Manas Shukla
@noodlecaake_twitter
I am looking for a machine-learning-based approach to detecting vulnerabilities, guided by Joern (query initialization)
Jai Verma
@jaiverma
i don't understand your question. do you mean graph clustering?
joern internally does use a graph database, so I guess you could do graph clustering...
Niko Schmidt
@itsacoderepo
There are two types of Joern users: people at universities who want to do machine learning, and people from industry who want to create queries for their work. :D
Manas Shukla
@noodlecaake_twitter
I am looking for some well documented use case of using machine learning guided by joern (like the one done by Fabian himself in his phd thesis chapter 7, https://ediss.uni-goettingen.de/bitstream/handle/11858/00-1735-0000-0023-9682-0/mainFastWeb.pdf)
Jai Verma
@jaiverma
@itsacoderepo, lol 😂
Manas Shukla
@noodlecaake_twitter
Is there a way to get the function bodies in a C++ program?
I tried cpg.method.name.l
I just need the functions that are defined in one particular file
Jai Verma
@jaiverma
@noodlecaake_twitter, you can do the following to get methods in a file:
cpg.method.l.filter(_.location.filename.contains("test.c"))     // replace 'test.c' with the file you're interested in
by function body, do you mean source code of the function?
for that you can do:
cpg.method.name("main").dump
Manas Shukla
@noodlecaake_twitter
@jaiverma This gives me function declarations as well
I need to apply another filter where lineNumber != lineNumberEnd
I only need the function definitions
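
A minimal sketch of the extra filter described above, written in plain Scala so that it runs outside Joern. `MethodInfo` and `DefinitionFilter` are hypothetical stand-ins invented for illustration; they are not Joern's node types, and the real method nodes may expose these fields differently.

```scala
// Hypothetical stand-in for a method node; not Joern's actual node type.
final case class MethodInfo(
    name: String,
    filename: String,
    lineNumber: Option[Int],
    lineNumberEnd: Option[Int]
)

object DefinitionFilter {
  // Keep methods from `file` whose start and end lines are both known and differ,
  // i.e. methods that appear to span a body rather than a one-line declaration.
  def definitionsIn(methods: List[MethodInfo], file: String): List[MethodInfo] =
    methods.filter { m =>
      m.filename.contains(file) &&
      m.lineNumber.isDefined &&
      m.lineNumberEnd.isDefined &&
      m.lineNumber != m.lineNumberEnd
    }
}
```

Note that this heuristic also drops definitions written on a single line, such as int f() { return 0; }, so it is only an approximation of "has a body".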
Manas Shukla
@noodlecaake_twitter
One more question
Is there any way to convert cpg into embeddings
XindaW
@XindaW
It seems that cpg.runScript("general/cfgToDot.sc") cannot output all the control flows for my files. Is my file too big? How should I fix this?
XindaW
@XindaW
I wonder why Joern cannot recognize the following function:
STACK_OF(X509) *X509_chain_up_ref(STACK_OF(X509) *chain)
{
    STACK_OF(X509) *ret;
    int i;
    ret = sk_X509_dup(chain);
    if (ret == NULL)
        return NULL;
    for (i = 0; i < sk_X509_num(ret); i++) {
        X509 *x = sk_X509_value(ret, i);
        if (!X509_up_ref(x))
            goto err;
    }
    return ret;
 err:
    while (i-- > 0)
        X509_free (sk_X509_value(ret, i));
    sk_X509_free(ret);
    return NULL;
}
Manas Shukla
@noodlecaake_twitter
Need help with this question
octopus-platform/joern#162
234235235
@234235235
Hey, how do I get the declaration of a buffer in order to check whether sizeof has been used? E.g.:
int test(char* s) { int size = sizeof(s); char buf[size+1]; strcpy(buf, s); }
234235235
@234235235
I know that snk.reachableByFlows(src) finds a flow, i.e.
src=cpg.method.name("test").parameter
snk=cpg.method.name("test").callOut.name("strcpy")
Now I want to check whether the first argument to strcpy was initialized with the correct size or with a constant,
i.e. something like cpg.call.lineNumber(7).argument.head....reachableBy(size), so that I know that sizeof was at least called, its result written to size, and size then used to initialize the buffer
Jai Verma
@jaiverma
@234235235, this should work for your example
def buf = cpg.identifier.name("buf")
val buf_type = buf.evalType.head

// get identifiers which are reachable by sizeof
val idents = cpg.identifier.whereNonEmpty(
        _.reachableBy(cpg.call.name("<operator>.sizeOf"))
    )

// check if any of these identifiers were used in declaration
// of buf
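// explicit lambda: in buf_type.contains(_.name) the placeholder would
// expand inside the argument and pass a function to contains
idents.l.filter(i => buf_type.contains(i.name))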

This isn't a very good approach, though, since we're just doing a string comparison on the data type: here char buf[size+1] will have an evalType of char[size+1].

Because it is only a string comparison, you'll get false matches if any identifier has a name that is a substring of the type.
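
To illustrate the false-match problem, here is a small plain-Scala sketch that compares a substring check with a whole-token check on a type string. `TypeNameMatch` and its helpers are hypothetical names made up for this example; nothing here is part of Joern's API.

```scala
object TypeNameMatch {
  // Split a type string such as "char[size+1]" into identifier-like tokens.
  def identifierTokens(typeString: String): Set[String] =
    "[A-Za-z_][A-Za-z0-9_]*".r.findAllIn(typeString).toSet

  // Substring check: the behaviour of a plain `contains` on the type string.
  def substringMatch(typeString: String, name: String): Boolean =
    typeString.contains(name)

  // Whole-token check: `name` must equal one of the identifier tokens.
  def tokenMatch(typeString: String, name: String): Boolean =
    identifierTokens(typeString).contains(name)
}
```

With the type string char[size+1], an identifier named s passes substringMatch but not tokenMatch, while size passes both. The token check is still only a heuristic: the base type name (char here) is itself a token, so an identifier that happens to be called char would match.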

AlexCrosby
@AlexCrosby
Is it still possible to generate AST trees? Or is that only in the old version
AlexCrosby
@AlexCrosby
In fact, a better question would be: can you use the old joern-tools on the new Joern?
AlexCrosby
@AlexCrosby
Thanks, with cpg.method.name("cource").ast.l, is there any way to include which nodes are connected to which in the given list?
AlexCrosby
@AlexCrosby
Also I think there is an issue with piping the output of cpg.method.name("name").ast.l to a file. I'll raise it on github later
clccc
@clccc
In the old Joern, I could traverse the CPG by following different types of edges, for example: g.v(123).outE; g.v(123).in . How can I do similar things in the new Joern? Thanks! @fabsx00 @mpollmeier @itsacoderepo
clccc
@clccc
Second Q: Can users write security policies for Joern, now or in the future?
Manas Shukla
@noodlecaake_twitter
Has anyone applied Spektral (Graph Neural Network library) on Code Property Graph ever ? Is it even possible ?
Manas Shukla
@noodlecaake_twitter
I have a dataset of C/C++ functions labelled as vulnerable or not for 5 types of vulnerabilities.
I have generated a code property graph for each function separately, and now I want to train a model on the code property graphs with the given labels
clccc
@clccc
Is there any command or other way to manage users' scripts in Joern?
I run my script with cpg.runScript("the absolute path of my script"). Having to give the absolute path every time is a bit of a hassle. Am I using Joern wrong?
jaylen
@BingSlient

Has anyone applied Spektral (Graph Neural Network library) on Code Property Graph ever ? Is it even possible ?

It's possible with other graph neural network frameworks, like pytorch-geometric or DGL; there has been some work on this problem, so I guess it wouldn't be a big deal.

Frederik Wenigwieser
@ain101
Is it possible to merge cpgs? I want to change a .c file without parsing the whole kernel again.
clccc
@clccc
If the id of a "memcpy" call is 6676, how can I get its argument list? Neither "cpg.id(6676).argument" nor "cpg.id(6676).cast[Call].argument" works. Thanks!
Jai Verma
@jaiverma
@clccc you could do:
cpg.call.l.filter(_.id == 6676).start.argument
Claudiu-Vlad Ursache
@ursachec
A similar query to what @jaiverma proposed is done using where:
cpg.call.where(_.id == 6676).argument.l
@noodlecaake_twitter That's without doubt possible, most of the work is in actually getting the Code Property Graph exported in a format that can be used by your machine learning algorithm. F-Secure has a team that does some tangential security work (though without a CPG involved) that might guide an idea or two: https://blog.f-secure.com/command-lines/
As to the CPG export, you can always use toJson or toJsonPretty on any queries you run to help you out
clccc
@clccc
Thanks @jaiverma @ursachec , it works.
Jai Verma
@jaiverma

Hello everyone, I have a question.

typedef struct {
    int num;
    void *ptr;
} arg_t;

void f(arg_t a) {
    int x = a.num;
    void* y = a.ptr;

    x = ntohl(x);
    printf("%d\n", x);
}

I am interested in finding flows from a struct field to any identifier. For example, flow from the void* ptr field of the arg_t struct to any local variable of the function f.

I am not able to find a way to search using field level granularity. My results contain all fields of the struct.

val src = cpg.method.name("f").parameter.filter(_.typ.name("arg_t"))
val sink = cpg.method.name("f").local.referencingIdentifiers
joern> sink.reachableByFlows(src).p

res5: List[String] = List(
  """________________________________________________________________________________________________________
| tracked          | lineNumber| method| file                                                           |
|=======================================================================================================|
| f(arg_t a)       | 9         | f     | /local/mnt/workspace/playground/joern_test/struct_access/main.c|
| x = a.num        | 10        | f     | /local/mnt/workspace/playground/joern_test/struct_access/main.c|
| ntohl(x)         | 13        | f     | /local/mnt/workspace/playground/joern_test/struct_access/main.c|
| x = ntohl(x)     | 13        | f     | /local/mnt/workspace/playground/joern_test/struct_access/main.c|
| printf("%d\n", x)| 14        | f     | /local/mnt/workspace/playground/joern_test/struct_access/main.c|
""",
...
  """__________________________________________________________________________________________________
| tracked    | lineNumber| method| file                                                           |
|=================================================================================================|
| f(arg_t a) | 9         | f     | /local/mnt/workspace/playground/joern_test/struct_access/main.c|
| * y = a.ptr| 11        | f     | /local/mnt/workspace/playground/joern_test/struct_access/main.c|
""",

I tried to filter using FieldIdentifier but did not get any results.

val src = cpg.method.name("f").ast.isFieldIdentifier
val sink = cpg.method.name("f").local.referencingIdentifiers

joern> sink.reachableByFlows(src).p

res16: List[String] = List()

I am resorting to this stupid hack for now which checks to see if the name of that field is in the path (but I'm sure this is far from foolproof).

def validate(flow: Steps[Path]) = {
    flow.l.filter(
        _.elements.filter(
            _.ast.isFieldIdentifier
            .where(
                _.canonicalName == "ptr"
            ).size > 0
        ).size > 0
    ).start
}

Also, I think FieldIdentifier does not have any data type information either, so I can't filter on data type of the field. This is what I get when I run joern-parse:

2020-07-08 12:10:24.077 WARN MemberAccessLinker: Could not find type member. type=arg_t, member=ptr
2020-07-08 12:10:24.080 WARN MemberAccessLinker: Could not find type member. type=arg_t, member=num