xiaotianming
@xiaotianming
@ursachec When my project contains a lot of files and I generate a CPG for each file one by one, Joern is very slow. When I generate a single CPG for the whole project at once, it is very fast. Is there any way to speed up the per-file analysis?
Claudiu-Vlad Ursache
@ursachec
@xiaotianming triggering an analysis has some ramp-up time, so if you trigger many analyses on small inputs, they may end up costing more time than a single large one. If you want to generate a large number of CPGs using joern, you might have to set up your own data processing pipeline, maybe using custom scripts (https://docs.joern.io/interpreter)
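A rough way to picture the ramp-up cost described above is the small Python sketch below. It is only a toy cost model: `ramp_up_s` and `per_file_s` are made-up numbers, not measurements of Joern, and `total_time` and `batches` are hypothetical helper names. The point is that grouping files into a few batches pays the fixed start-up cost far fewer times than one run per file.

```python
def total_time(num_files, runs, ramp_up_s, per_file_s):
    """Estimated wall time when num_files files are analysed in `runs` invocations.

    Each invocation pays a fixed ramp-up cost; each file pays a per-file cost.
    """
    return runs * ramp_up_s + num_files * per_file_s


def batches(files, batch_size):
    """Split files into consecutive groups of at most batch_size items."""
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]


# Illustrative inputs only (assumed values, not taken from a real project).
files = [f"src/file_{i}.c" for i in range(10)]

one_by_one = total_time(len(files), runs=len(files), ramp_up_s=5.0, per_file_s=0.2)
all_at_once = total_time(len(files), runs=1, ramp_up_s=5.0, per_file_s=0.2)
groups = batches(files, 4)
in_batches = total_time(len(files), runs=len(groups), ramp_up_s=5.0, per_file_s=0.2)

print(one_by_one, in_batches, all_at_once)
```

If per-file CPGs are really needed, a batch size somewhere between 1 and "everything" trades output granularity against the repeated start-up cost.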
xiaotianming
@xiaotianming
Thank you, @ursachec!
Nikita Mehrotra
@nikitamehrotra12
Hi, I'm a new Joern user. I was exporting the generated CPG14 to a dot file, but when I use the joern-export command I get the error "command not found".
damaoooo
@damaoooo
Hi, how can I export a Joern CPG to node.csv and edge.csv, or to another file format that neo4j can read, and how can I get it into python? With old versions of joern and neo4j this could be done by changing the data path of neo4j, but that no longer works with new versions of joern or neo4j. So how can I export the CPG14 to neo4j and python?
m1cm1c
@m1cm1c
according to "Modeling and Discovering Vulnerabilities with Code Property Graphs", control flow edges need to be labeled: "While these edges need not be ordered as in the case of the abstract syntax trees, it is necessary to assign a label of true, false or ε to each edge." how can these labels be accessed in joern? i assumed that edge labels are modeled as edge properties. but i cannot find a single control flow edge with any properties
sweetchuck8481
@sweetchuck8481
[image: grafik.png]
Hey guys, I installed joern today and encountered the same problem as @colorlight while trying the stuff from your documentation.
I am running a VM with Ubuntu 16.04.5
Thanks in advance for looking into it!
Juilia F
@FJuilia_twitter
hey :) i'm just wondering how i can follow data dependency edges. i can see them when i export the DDG via joern-export. but i don't know what types of edges to look for when i'm in joern. can you help me, please?
sweetchuck8481
@sweetchuck8481
Hello again. I checked Version v1.1.55 and with that it worked fine. Maybe that information can help.
Claudiu-Vlad Ursache
@ursachec

hey @FJuilia_twitter! Joern features a step named ddgIn you can use to follow data dependency edges. For example, in the following program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
  if (argc > 1 && strcmp(argv[1], "42") == 0) {
    fprintf(stderr, "It depends!\n");
    exit(42);
  }
  printf("What is the meaning of life?\n");
  exit(0);
}

you can follow DDG edges for the call to strcmp like so:

joern> cpg.call.name("strcmp").ddgIn.l 
res103: List[nodes.TrackingPoint] = List(
  Literal(
    id -> 1000117L,
    code -> "0",
    order -> 2,
    argumentIndex -> 2,
    typeFullName -> "int",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(6),
    columnNumber -> Some(43),
    depthFirstOrder -> None,
    internalFlags -> None
  ),
  MethodParameterIn(
    id -> 1000104L,
    code -> "char *argv[]",
    order -> 2,
    name -> "argv",
    evaluationStrategy -> "BY_VALUE",
    typeFullName -> "char * [ ]",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(5),
    columnNumber -> Some(19)
  )
)
Claudiu-Vlad Ursache
@ursachec
Additionally, reachableBy might also help:
joern> cpg.call.name("strcmp").reachableBy(cpg.method.parameter).l 
res105: List[MethodParameterIn] = List(
  MethodParameterIn(
    id -> 1000104L,
    code -> "char *argv[]",
    order -> 2,
    name -> "argv",
    evaluationStrategy -> "BY_VALUE",
    typeFullName -> "char * [ ]",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(5),
    columnNumber -> Some(19)
  )
)
Juilia F
@FJuilia_twitter

@ursachec thank you for your answer :) unfortunately, your solution does not seem to work. ddgIn always yields an empty list, including if i try your example. i also noticed that ddgOut does not exist:

joern> cpg.call.name("strcmp").ddgIn.l 
res59: List[nodes.TrackingPoint] = List()

joern> cpg.call.name("strcmp").l 
res60: List[Call] = List(
  Call(
    id -> 1000112L,
    code -> "strcmp(argv[1], \"42\")",
    name -> "strcmp",
    order -> 1,
    methodInstFullName -> None,
    methodFullName -> "strcmp",
    argumentIndex -> 1,
    dispatchType -> "STATIC_DISPATCH",
    signature -> "TODO assignment signature",
    typeFullName -> "ANY",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(6),
    columnNumber -> Some(18),
    resolved -> None,
    depthFirstOrder -> None,
    internalFlags -> None
  )
)

joern> cpg.call.name("strcmp").ddgIn.l 
res61: List[nodes.TrackingPoint] = List()

joern> cpg.call.name("strcmp").ddgOut.l 
cmd62.sc:1: value ddgOut is not a member of overflowdb.traversal.Traversal[io.shiftleft.codepropertygraph.generated.nodes.Call]
val res62 = cpg.call.name("strcmp").ddgOut.l
                                    ^
Compilation Failed

if i try reachableBy, i also just get an empty list:

joern> cpg.call.name("strcmp").reachableBy(cpg.method.parameter).l 
res62: List[MethodParameterIn] = List()

is there a command that needs to be called first so that these commands work? like a command to build the DDG?

Claudiu-Vlad Ursache
@ursachec
@FJuilia_twitter Ah, right, forgot to mention, you have to run joern> run.ossdataflow first
Juilia F
@FJuilia_twitter

@ursachec thank you, it works now! :) however, i cannot re-create the data flow example of the paper "Modeling and Discovering Vulnerabilities with Code Property Graphs". the paper contains a PDG for this code:

void foo()
{
  int x = source();
  if (x < MAX)
    {
      int y = 2 * x;
      sink(y);
    }
}

i'm trying to prove using joern that there is data flow between int x = source() and sink(y). via ./joern-export --repr ddg --out outdir i get output that includes:

  "1000105" -> "1000109"  [ label = "x"] 
  "1000102" -> "1000109" 
  "1000116" -> "1000114"  [ label = "2"] 
  "1000116" -> "1000114"  [ label = "x"] 
  "1000102" -> "1000114" 
  "1000102" -> "1000116" 
  "1000109" -> "1000116"  [ label = "x"] 
  "1000114" -> "1000119"  [ label = "y"]

from this, i can see that 1000105 → 1000109 → 1000116 → 1000114 → 1000119 is a path. 1000105 is int x = source() and 1000119 is sink(y). this proves the data flow. now i want to re-create this in joern. because ddgOut does not seem to exist, i'm walking backwards (starting at the sink): https://pastebin.com/U8VHFBWD i eventually get to 1000106L which is the call to source() but i never get to the assignment call int x = source()
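The path read off the exported edges above can also be checked mechanically. The following Python sketch parses the DOT edge lines quoted in the message and runs a breadth-first search over them. The helper names `parse_edges` and `find_path` are hypothetical, and the regular expression only assumes the `"id" -> "id"` edge shape shown above, not the full DOT grammar.

```python
import re
from collections import deque

# Edge lines copied from the joern-export output quoted above.
DOT_EDGES = '''
  "1000105" -> "1000109"  [ label = "x"]
  "1000102" -> "1000109"
  "1000116" -> "1000114"  [ label = "2"]
  "1000116" -> "1000114"  [ label = "x"]
  "1000102" -> "1000114"
  "1000102" -> "1000116"
  "1000109" -> "1000116"  [ label = "x"]
  "1000114" -> "1000119"  [ label = "y"]
'''

EDGE_RE = re.compile(r'"(\d+)"\s*->\s*"(\d+)"')


def parse_edges(dot_text):
    """Build an adjacency map {source id: set of target ids} from DOT edge lines."""
    graph = {}
    for src, dst in EDGE_RE.findall(dot_text):
        graph.setdefault(src, set()).add(dst)
    return graph


def find_path(graph, start, goal):
    """Return one shortest path from start to goal as a list of ids, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


graph = parse_edges(DOT_EDGES)
path = find_path(graph, "1000105", "1000119")
print(" -> ".join(path) if path else "no path")
```

The search follows edges in their exported direction only, so asking for a path from the sink back to the source returns `None`.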

Juilia F
@FJuilia_twitter

reachableBy() does not seem to be the solution because that does not find the data flow either:

joern> cpg.call.id(1000105L).reachableBy(cpg.call.id(1000105L)).l 
res100: List[Call] = List(
  Call(
    id -> 1000105L,
    code -> "x = source()",
    name -> "<operator>.assignment",
    order -> 2,
    methodInstFullName -> None,
    methodFullName -> "<operator>.assignment",
    argumentIndex -> 2,
    dispatchType -> "STATIC_DISPATCH",
    signature -> "TODO assignment signature",
    typeFullName -> "ANY",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(3),
    columnNumber -> Some(6),
    resolved -> None,
    depthFirstOrder -> None,
    internalFlags -> None
  )
)

joern> cpg.call.id(1000119L).reachableBy(cpg.call.id(1000119L)).l 
res101: List[Call] = List(
  Call(
    id -> 1000119L,
    code -> "sink(y)",
    name -> "sink",
    order -> 3,
    methodInstFullName -> None,
    methodFullName -> "sink",
    argumentIndex -> 3,
    dispatchType -> "STATIC_DISPATCH",
    signature -> "TODO assignment signature",
    typeFullName -> "ANY",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(7),
    columnNumber -> Some(3),
    resolved -> None,
    depthFirstOrder -> None,
    internalFlags -> None
  )
)

joern> cpg.call.id(1000105L).reachableBy(cpg.call.id(1000119L)).l 
res102: List[Call] = List()

joern> cpg.call.id(1000119L).reachableBy(cpg.call.id(1000105L)).l 
res103: List[Call] = List()

i printed the reachability of the nodes to themselves first so you can be sure that i'm at the correct nodes. do you know why this doesn't work? :)

Claudiu-Vlad Ursache
@ursachec
@FJuilia_twitter I am not 100% certain, but I think the behavior you're seeing is because when you're referencing the call to the assignment operator, you're actually referring to the return value of that call, which in your case, is not part of the flow.
if you'd take the source as being the identifier x at line 3, you'd find a flow, and similarly for the call to source also at line 3
def source = cpg.identifier.lineNumber(3)
def sink = cpg.call.name("sink")
sink.reachableBy(source).l
Juilia F
@FJuilia_twitter
@ursachec i'll just always use .astChildren then. thank you very much! :)
Claudiu-Vlad Ursache
@ursachec
Glad I could help @FJuilia_twitter !
m1cm1c
@m1cm1c
hi, is it possible to unify two traversals more easily / more efficiently than by turning both of them into lists, concatenating the lists, and then feeding the concatenated lists into the Traversal constructor?
MeNicefellow
@MeNicefellow
Hi, does anyone know the best way to convert cpg.bin to JSON so that it can be loaded by python?
m1cm1c
@m1cm1c
@MeNicefellow it might be a better idea to export to the dot format: https://docs.joern.io/exporting/
MeNicefellow
@MeNicefellow
@m1cm1c Thanks man.
Does anyone know how to get the line number that corresponds to a cpg node when I use joern-export to export to dot files?
Alessandro Mantovani
@elManto
Hi! What's the best way to log the results of a query in a file?
Claudiu-Vlad Ursache
@ursachec
@elManto you can use the |> operator, e.g. cpg.method.fullName.l |> "my-fullnames.txt"
Juilia F
@FJuilia_twitter
hey :) i'm trying to distinguish write access from read access. is there a way of finding out whether a specific local variable gets written to? preferably also with some location of where (e.g. which of its identifiers or what call is used)
Claudiu-Vlad Ursache
@ursachec
@FJuilia_twitter it depends what you mean by gets written to. For assignments like x = 2, you can search the graph for CALL nodes with the assignment operator as their method, e.g. cpg.call.methodFullName(Operators.assignment).l. If you're looking for byte-copying stdlib functions with a specific variable as argument, you would search for cpg.call.code(".*strcpy.*").where(_.argument.codeExact("x")) . Other steps from the reference card might be helpful https://docs.joern.io/cpgql/reference-card
Juilia F
@FJuilia_twitter
@ursachec thanks for the ideas. i'm mostly concerned about writes through operators, not through functions that just happen to perform a write. but there are many ways operators can write. i can think of =, +=, -=, *=, /=, %=, |=, &=, ^=, <<=, >>=, var++, ++var, var--, and --var. but there might be more. i think that arrays and structs further complicate things. is there a universal way of detecting writes, at least as far as operators are concerned?
Niko Schmidt
@itsacoderepo
@FJuilia_twitter maybe there is a misunderstanding here. You can do:
[image: call.png]
So you can think of the method "assignment". If I stick to the code in the screenshot, it is =(res,crypto_scalarmult((unsigned char *)q, (unsigned char *)n, (unsigned char *)p)).
Juilia F
@FJuilia_twitter
@itsacoderepo thanks but i know that operators are implemented as calls. i was wondering whether there is something built-in that finds all calls that definitely perform a write. shortly before you answered, i gave up on finding it and am now using a filter: .filter(node => node.property("NAME") != null && (Array("<operator>.preIncrement", "<operator>.postIncrement", "<operator>.preDecrement", "<operator>.postDecrement").toList.contains(node.property("NAME").toString) || node.property("NAME").toString.slice(0, 21).equals("<operator>.assignment")))
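The filter in the message above boils down to a name check: the four increment/decrement operators, plus anything whose name starts with `<operator>.assignment`. Below is a minimal Python sketch of the same predicate, for example for post-processing exported call names outside of Joern. The function name `is_write_operator` is hypothetical, and the sample names `<operator>.assignmentPlus` and `<operator>.addition` are assumed for illustration; check them against the operator names your Joern version actually emits.

```python
# Operator names listed explicitly in the filter above.
INC_DEC_OPERATORS = {
    "<operator>.preIncrement",
    "<operator>.postIncrement",
    "<operator>.preDecrement",
    "<operator>.postDecrement",
}

# Prefix shared by plain and compound assignment operators in that filter.
ASSIGNMENT_PREFIX = "<operator>.assignment"


def is_write_operator(name):
    """Return True if a call name denotes an operator that writes to its operand."""
    if name is None:
        return False
    return name in INC_DEC_OPERATORS or name.startswith(ASSIGNMENT_PREFIX)


# Sample call names; the compound-assignment and addition names are assumptions.
samples = [
    "<operator>.assignment",
    "<operator>.assignmentPlus",
    "<operator>.preDecrement",
    "<operator>.addition",
    "strcmp",
    None,
]
writes = [name for name in samples if is_write_operator(name)]
print(writes)
```

The prefix test mirrors the `slice(0, 21)` comparison in the original filter, since `<operator>.assignment` is 21 characters long.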
Niko Schmidt
@itsacoderepo

"@itsacoderepo thanks but i know that operators are implemented as calls."

Then i misunderstood your question.

[image: operators.png]
you can use regex to get the methods you want ^
Juilia F
@FJuilia_twitter
@itsacoderepo oh, thank you for the hint! yes, that's much easier :)
Niko Schmidt
@itsacoderepo
From there you can go to the calls:
joern> cpg.method.name("<operator>.*").callIn.head 
res12: Call = Call(
  id -> 1000882L,
  code -> "--pos",
  name -> "<operator>.preDecrement",
  order -> 3,
  methodInstFullName -> None,
  methodFullName -> "<operator>.preDecrement",
  argumentIndex -> 3,
  dispatchType -> "STATIC_DISPATCH",
  signature -> "TODO assignment signature",
  typeFullName -> "ANY",
  dynamicTypeHintFullName -> List(),
  lineNumber -> Some(value = 189),
  columnNumber -> Some(value = 26),
  resolved -> None,
  depthFirstOrder -> None,
  internalFlags -> None
)
or
joern> val myOperators = List("<operator>.preDecrement", "<operator>.assignment") 
myOperators: List[String] = List("<operator>.preDecrement", "<operator>.assignment")

joern> cpg.method.name(myOperators:_*).name.p 
res18: List[String] = List("<operator>.preDecrement", "<operator>.assignment")
I personally like to create a list of interesting methods at the beginning of a script and use it as "var arg" later on, like myOperators:_*
Juilia F
@FJuilia_twitter
okay, i'll try that :) thank you
Niko Schmidt
@itsacoderepo
np
hyunji-Hong
@hyunji-Hong

hi! I'm new to Joern, and I'm having difficulty connecting to Joern in server mode (./joern --server).
I want to connect from my local PC (macOS) to my VM server (Ubuntu).

(I start the Joern server in my VM and try to access it through Python on my local PC, macOS.)
But when I ran my Python program, it failed with a connection error.

Here are some of the details:

[ip info]
vmware ubuntu(NAT): 172.16.191.2

[error message]
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/asyncio/selector_events.py", line 526, in _sock_connect_cb
raise OSError(err, f'Connect call failed {address}')
ConnectionRefusedError: [Errno 61] Connect call failed ('172.16.191.2', 8080)

[in my python program code]
server_endpoint = "172.16.191.2:8080"  # following the cpgqls-client example on GitHub

[More info]
1) I checked netstat on the VM while the Joern server was running, and port 8080 is open.
2) I checked the connection between the VM and the local PC with ping, and it's OK.
3) I ran tcpdump on the local PC: when the local PC accesses the Joern server port on the VM, the VM returns an RST packet, so the connection fails.

So, is there a solution to this issue?
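One way to narrow down a refused connection like the one above is to test the raw TCP port separately from any client library. The Python sketch below uses only the standard library; `can_connect` is a hypothetical helper name, and the demo talks to a throwaway local listener rather than a real Joern server. A port that answers a ping but refuses TCP connections from outside is often one that is bound only to the loopback interface, although that is an assumption about this setup and not something the messages confirm.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Demo against a temporary local listener (not a Joern server).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

reachable_while_listening = can_connect("127.0.0.1", demo_port)
listener.close()
reachable_after_close = can_connect("127.0.0.1", demo_port)

print(reachable_while_listening, reachable_after_close)
```

Running `can_connect("127.0.0.1", 8080)` inside the VM and `can_connect("172.16.191.2", 8080)` from the local PC (the address from the message above) would show whether the server is reachable only locally.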

xshub
@xshub
Hi, I want to run a script that extracts the PDG/AST/CFG and saves it to a JSON file using Joern. I adapted the "graph-for-funcs.sc" script from the old version of Joern, and it works. But the result is missing a lot of information compared to the output of the Joern shell command (e.g. "cpg.method(xx).dotPdg.l").