sweetchuck8481
@sweetchuck8481
Thanks in advance for looking into it!
Juilia F
@FJuilia_twitter
hey :) i'm just wondering how i can follow data dependency edges. i can see them when i export the DDG via joern-export. but i don't know what types of edges to look for when i'm in joern. can you help me, please?
sweetchuck8481
@sweetchuck8481
Hello again. I checked Version v1.1.55 and with that it worked fine. Maybe that information can help.
Claudiu-Vlad Ursache
@ursachec

hey @FJuilia_twitter! Joern features a step named ddgIn you can use to follow data dependency edges. For example, in the following program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
  if (argc > 1 && strcmp(argv[1], "42") == 0) {
    fprintf(stderr, "It depends!\n");
    exit(42);
  }
  printf("What is the meaning of life?\n");
  exit(0);
}

you can follow DDG edges for the call to strcmp like so:

joern> cpg.call.name("strcmp").ddgIn.l 
res103: List[nodes.TrackingPoint] = List(
  Literal(
    id -> 1000117L,
    code -> "0",
    order -> 2,
    argumentIndex -> 2,
    typeFullName -> "int",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(6),
    columnNumber -> Some(43),
    depthFirstOrder -> None,
    internalFlags -> None
  ),
  MethodParameterIn(
    id -> 1000104L,
    code -> "char *argv[]",
    order -> 2,
    name -> "argv",
    evaluationStrategy -> "BY_VALUE",
    typeFullName -> "char * [ ]",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(5),
    columnNumber -> Some(19)
  )
)
Claudiu-Vlad Ursache
@ursachec
Additionally, reachableBy might also help:
joern> cpg.call.name("strcmp").reachableBy(cpg.method.parameter).l 
res105: List[MethodParameterIn] = List(
  MethodParameterIn(
    id -> 1000104L,
    code -> "char *argv[]",
    order -> 2,
    name -> "argv",
    evaluationStrategy -> "BY_VALUE",
    typeFullName -> "char * [ ]",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(5),
    columnNumber -> Some(19)
  )
)
Juilia F
@FJuilia_twitter

@ursachec thank you for your answer :) unfortunately, your solution does not seem to work. ddgIn always yields an empty list, including if i try your example. i also noticed that ddgOut does not exist:

joern> cpg.call.name("strcmp").ddgIn.l 
res59: List[nodes.TrackingPoint] = List()

joern> cpg.call.name("strcmp").l 
res60: List[Call] = List(
  Call(
    id -> 1000112L,
    code -> "strcmp(argv[1], \"42\")",
    name -> "strcmp",
    order -> 1,
    methodInstFullName -> None,
    methodFullName -> "strcmp",
    argumentIndex -> 1,
    dispatchType -> "STATIC_DISPATCH",
    signature -> "TODO assignment signature",
    typeFullName -> "ANY",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(6),
    columnNumber -> Some(18),
    resolved -> None,
    depthFirstOrder -> None,
    internalFlags -> None
  )
)

joern> cpg.call.name("strcmp").ddgIn.l 
res61: List[nodes.TrackingPoint] = List()

joern> cpg.call.name("strcmp").ddgOut.l 
cmd62.sc:1: value ddgOut is not a member of overflowdb.traversal.Traversal[io.shiftleft.codepropertygraph.generated.nodes.Call]
val res62 = cpg.call.name("strcmp").ddgOut.l
                                    ^
Compilation Failed

if i try reachableBy, i also just get an empty list:

joern> cpg.call.name("strcmp").reachableBy(cpg.method.parameter).l 
res62: List[MethodParameterIn] = List()

is there a command that needs to be called first so that these commands work? like a command to build the DDG?

Claudiu-Vlad Ursache
@ursachec
@FJuilia_twitter Ah, right, I forgot to mention: you have to run run.ossdataflow at the joern> prompt first
Juilia F
@FJuilia_twitter

@ursachec thank you, it works now! :) however, i cannot re-create the data flow example of the paper "Modeling and Discovering Vulnerabilities with Code Property Graphs". the paper contains a PDG for this code:

void foo()
{
  int x = source();
  if (x < MAX)
    {
      int y = 2 * x;
      sink(y);
    }
}

i'm trying to prove using joern that there is data flow between int x = source() and sink(y). via ./joern-export --repr ddg --out outdir i get output that includes:

  "1000105" -> "1000109"  [ label = "x"] 
  "1000102" -> "1000109" 
  "1000116" -> "1000114"  [ label = "2"] 
  "1000116" -> "1000114"  [ label = "x"] 
  "1000102" -> "1000114" 
  "1000102" -> "1000116" 
  "1000109" -> "1000116"  [ label = "x"] 
  "1000114" -> "1000119"  [ label = "y"]

from this, i can see that 1000105 → 1000109 → 1000116 → 1000114 → 1000119 is a path. 1000105 is int x = source() and 1000119 is sink(y). this proves the data flow. now i want to re-create this in joern. because ddgOut does not seem to exist, i'm walking backwards (starting at the sink): https://pastebin.com/U8VHFBWD i eventually get to 1000106L which is the call to source() but i never get to the assignment call int x = source()
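The path read off the export above can also be checked mechanically. A minimal sketch, assuming the edges are copied by hand from the joern-export output quoted above; the names EDGES and find_path are hypothetical and not part of Joern:

```python
from collections import deque

# DDG edges copied from the joern-export output above. Labels are dropped,
# so the two 1000116 -> 1000114 edges (labels "2" and "x") collapse into one.
EDGES = [
    (1000105, 1000109),
    (1000102, 1000109),
    (1000116, 1000114),
    (1000102, 1000114),
    (1000102, 1000116),
    (1000109, 1000116),
    (1000114, 1000119),
]

def find_path(edges, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in successors.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(find_path(EDGES, 1000105, 1000119))
```

This only confirms reachability in the exported graph; it says nothing about how reachableBy treats the same nodes inside Joern.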

Juilia F
@FJuilia_twitter

reachableBy() does not seem to be the solution because that does not find the data flow either:

joern> cpg.call.id(1000105L).reachableBy(cpg.call.id(1000105L)).l 
res100: List[Call] = List(
  Call(
    id -> 1000105L,
    code -> "x = source()",
    name -> "<operator>.assignment",
    order -> 2,
    methodInstFullName -> None,
    methodFullName -> "<operator>.assignment",
    argumentIndex -> 2,
    dispatchType -> "STATIC_DISPATCH",
    signature -> "TODO assignment signature",
    typeFullName -> "ANY",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(3),
    columnNumber -> Some(6),
    resolved -> None,
    depthFirstOrder -> None,
    internalFlags -> None
  )
)

joern> cpg.call.id(1000119L).reachableBy(cpg.call.id(1000119L)).l 
res101: List[Call] = List(
  Call(
    id -> 1000119L,
    code -> "sink(y)",
    name -> "sink",
    order -> 3,
    methodInstFullName -> None,
    methodFullName -> "sink",
    argumentIndex -> 3,
    dispatchType -> "STATIC_DISPATCH",
    signature -> "TODO assignment signature",
    typeFullName -> "ANY",
    dynamicTypeHintFullName -> List(),
    lineNumber -> Some(7),
    columnNumber -> Some(3),
    resolved -> None,
    depthFirstOrder -> None,
    internalFlags -> None
  )
)

joern> cpg.call.id(1000105L).reachableBy(cpg.call.id(1000119L)).l 
res102: List[Call] = List()

joern> cpg.call.id(1000119L).reachableBy(cpg.call.id(1000105L)).l 
res103: List[Call] = List()

i printed the reachability of the nodes to themselves first so you can be sure that i'm at the correct nodes. do you know why this doesn't work? :)

Claudiu-Vlad Ursache
@ursachec
@FJuilia_twitter I am not 100% certain, but I think the behavior you're seeing is because when you reference the call to the assignment operator, you're actually referring to the return value of that call, which, in your case, is not part of the flow.
If you take the identifier x at line 3 as the source, you'd find a flow, and similarly for the call to source, also at line 3:
def source = cpg.identifier.lineNumber(3)
def sink = cpg.call.name("sink")
sink.reachableBy(source).l
Juilia F
@FJuilia_twitter
@ursachec i'll just always use .astChildren then. thank you very much! :)
Claudiu-Vlad Ursache
@ursachec
Glad I could help @FJuilia_twitter !
m1cm1c
@m1cm1c
hi, is it possible to unify two traversals more easily / more efficiently than by turning both of them into lists, concatenating the lists, and then feeding the concatenated lists into the Traversal constructor?
MeNicefellow
@MeNicefellow
Hi, does anyone know the best way to convert cpg.bin to JSON so that it can be loaded in Python?
m1cm1c
@m1cm1c
@MeNicefellow it might be a better idea to export to the dot format: https://docs.joern.io/exporting/
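For loading the dot export in Python, here is a minimal sketch. It assumes the export contains edge lines shaped like the ones quoted earlier in this chat, for example "1000105" -> "1000109"  [ label = "x"]. The function name dot_edges_to_json and the JSON keys src, dst and label are hypothetical, and a real dot parser would handle escaped quotes and node attributes more robustly:

```python
import json
import re

# Matches edge lines such as: "1000105" -> "1000109"  [ label = "x"]
# The label part is optional, as in: "1000102" -> "1000109"
EDGE_RE = re.compile(
    r'"(\d+)"\s*->\s*"(\d+)"(?:\s*\[\s*label\s*=\s*"([^"]*)"\s*\])?'
)

def dot_edges_to_json(dot_text):
    """Collect the edges of a dot document into a JSON string."""
    edges = []
    for line in dot_text.splitlines():
        match = EDGE_RE.search(line)
        if match is None:
            continue  # not an edge line (node definition, braces, ...)
        src, dst, label = match.groups()
        # label is None when the edge carries no label attribute
        edges.append({"src": int(src), "dst": int(dst), "label": label})
    return json.dumps({"edges": edges})

sample = '''
  "1000105" -> "1000109"  [ label = "x"]
  "1000102" -> "1000109"
'''
print(dot_edges_to_json(sample))
```

Node lines are skipped here, so node properties would need a separate pass.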
MeNicefellow
@MeNicefellow
@m1cm1c Thanks man.
Does anyone know how to get the line number that corresponds to a CPG node when I use joern-export to export to dot files?
Alessandro Mantovani
@elManto
Hi! What's the best way to log the results of a query in a file?
Claudiu-Vlad Ursache
@ursachec
@elManto you can use the |> operator, e.g. cpg.method.fullName.l |> "my-fullnames.txt"
Juilia F
@FJuilia_twitter
hey :) i'm trying to distinguish write access from read access. is there a way of finding out whether a specific local variable gets written to? preferably also with some location of where (e.g. which of its identifiers or what call is used)
Claudiu-Vlad Ursache
@ursachec
@FJuilia_twitter it depends on what you mean by "gets written to". For assignments like x = 2, you can search the graph for CALL nodes with the assignment operator as their method, e.g. cpg.call.methodFullName(Operators.assignment).l. If you're looking for byte-copying stdlib functions with a specific variable as argument, you would search for cpg.call.code(".*strcpy.*").where(_.argument.codeExact("x")). Other steps from the reference card might be helpful: https://docs.joern.io/cpgql/reference-card
Juilia F
@FJuilia_twitter
@ursachec thanks for the ideas. i'm mostly concerned about writes through operators, not through functions that just happen to perform a write. but there are many ways operators can write. i can think of =, +=, -=, *=, /=, %=, |=, &=, ^=, <<=, >>=, var++, ++var, var--, and --var. but there might be more. i think that arrays and structs further complicate things. is there a universal way of detecting writes, at least as far as operators are concerned?
Niko Schmidt
@itsacoderepo
@FJuilia_twitter maybe there is a misunderstanding here. You can do:
call.png
So you can think of the method "assignment". If I stick to the code in the screenshot, it is =(res, crypto_scalarmult((unsigned char *)q, (unsigned char *)n, (unsigned char *)p)).
Juilia F
@FJuilia_twitter
@itsacoderepo thanks but i know that operators are implemented as calls. i was wondering whether there is something built-in that finds all calls that definitely perform a write. shortly before you answered, i gave up on finding it and am now using a filter: .filter(node => node.property("NAME") != null && (Array("<operator>.preIncrement", "<operator>.postIncrement", "<operator>.preDecrement", "<operator>.postDecrement").toList.contains(node.property("NAME").toString) || node.property("NAME").toString.slice(0, 21).equals("<operator>.assignment")))
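The name test inside that filter boils down to a set lookup plus a prefix check, since "<operator>.assignment" is exactly 21 characters long. A minimal sketch of just that predicate outside Joern, with hypothetical names (INC_DEC_OPERATORS, ASSIGNMENT_PREFIX, is_write_operator):

```python
# Operator names taken from the filter above.
INC_DEC_OPERATORS = {
    "<operator>.preIncrement",
    "<operator>.postIncrement",
    "<operator>.preDecrement",
    "<operator>.postDecrement",
}

# 21 characters long, which is why the filter uses slice(0, 21).
ASSIGNMENT_PREFIX = "<operator>.assignment"

def is_write_operator(name):
    """True if a call name denotes an increment, decrement, or assignment."""
    if name is None:
        return False  # mirrors the null check in the original filter
    return name in INC_DEC_OPERATORS or name.startswith(ASSIGNMENT_PREFIX)

print(is_write_operator("<operator>.preDecrement"))
print(is_write_operator("<operator>.addition"))
```

The prefix check accepts any name that starts with "<operator>.assignment", so compound assignments are covered only if their names follow that pattern, which is an assumption of the original filter as well.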
Niko Schmidt
@itsacoderepo

@itsacoderepo thanks but i know that operators are implemented as calls.

Then i misunderstood your question.

operators.png
you can use regex to get the methods you want ^
Juilia F
@FJuilia_twitter
@itsacoderepo oh, thank you for the hint! yes, that's much easier :)
Niko Schmidt
@itsacoderepo
From there you can go to the calls:
joern> cpg.method.name("<operator>.*").callIn.head 
res12: Call = Call(
  id -> 1000882L,
  code -> "--pos",
  name -> "<operator>.preDecrement",
  order -> 3,
  methodInstFullName -> None,
  methodFullName -> "<operator>.preDecrement",
  argumentIndex -> 3,
  dispatchType -> "STATIC_DISPATCH",
  signature -> "TODO assignment signature",
  typeFullName -> "ANY",
  dynamicTypeHintFullName -> List(),
  lineNumber -> Some(value = 189),
  columnNumber -> Some(value = 26),
  resolved -> None,
  depthFirstOrder -> None,
  internalFlags -> None
)
or
joern> val myOperators = List("<operator>.preDecrement", "<operator>.assignment") 
myOperators: List[String] = List("<operator>.preDecrement", "<operator>.assignment")

joern> cpg.method.name(myOperators:_*).name.p 
res18: List[String] = List("<operator>.preDecrement", "<operator>.assignment")
I personally like to create a list of interesting methods at the beginning of a script and pass it as varargs later on, like myOperators:_*
Juilia F
@FJuilia_twitter
okay, i'll try that :) thank you
Niko Schmidt
@itsacoderepo
np
hyunji-Hong
@hyunji-Hong

hi! I'm new to Joern, and I'm having difficulty connecting to Joern in server mode (./joern --server).
I want to connect to my VM server (Ubuntu) from my local PC (macOS).

(I start the Joern server in my VM and try to access it through Python on my local PC.)
But when I ran my Python program, it failed with a connection error.

Here are some of the details:

[ip info]
vmware ubuntu(NAT): 172.16.191.2

[error message]
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/asyncio/selector_events.py", line 526, in _sock_connect_cb
raise OSError(err, f'Connect call failed {address}')
ConnectionRefusedError: [Errno 61] Connect call failed ('172.16.191.2', 8080)

[in my python program code]
server_endpoint = "172.16.191.2:8080"  # following the GitHub (cpgqls-client) example

[More info]
1) I checked netstat on the VM after starting the Joern server, and I saw that port 8080 is open.
2) I checked the connection between the VM and the local PC, and it's OK (checked with ping).
3) I checked tcpdump on the local PC: when the local PC accesses the Joern server port on the VM, the VM returns an RST packet, so the connection fails.

So, is there a solution to this issue?
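To separate a network problem from a client-library problem, the port can be probed with the standard library alone. A minimal sketch; the helper name can_connect is hypothetical. This chat does not confirm the cause, but a refused connection on a port that netstat shows as open can mean the server is listening only on the loopback address, so the local address column in the VM's netstat output may be worth checking:

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return False
```

Calling can_connect("172.16.191.2", 8080) from the local PC and can_connect("127.0.0.1", 8080) inside the VM would show whether the port is reachable from each side.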

xshub
@xshub
Hi, I want to run a script to extract the PDG/AST/CFG and save it to a JSON file using Joern. I adapted the "graph-for-funcs.sc" script from the old version of Joern, and it works. But the result is missing a lot of information compared to the result extracted with a Joern shell command (e.g. "cpg.method(xx).dotPdg.l").
  val cfgChildren = method.out(EdgeTypes.CFG).asScala.collect { case node: nodes.CfgNode => node }.toList

  // val local = new NodeSteps(
  val local = new Traversal(
    //methodVertex
    method
      .out(EdgeTypes.CONTAINS)
      .hasLabel(NodeTypes.BLOCK)
      .out(EdgeTypes.AST)
      .hasLabel(NodeTypes.LOCAL)
      .cast[nodes.Local])
  val sink = local.evalType(".*").referencingIdentifiers.dedup
  //val source = new NodeSteps(methodVertex.out(EdgeTypes.CONTAINS).hasLabel(NodeTypes.CALL).cast[nodes.Call]).nameNot("<operator>.*").dedup
  val source = new Traversal(method.out(EdgeTypes.CONTAINS).hasLabel(NodeTypes.CALL).cast[nodes.Call]).nameNot("<operator>.*").dedup

  val pdgChildren = sink
    .reachableByFlows(source)
    .l
    .flatMap { path =>
      path.elements
        .map {
          case trackingPoint @ (_: MethodParameterIn) => trackingPoint.start.method.head
          case trackingPoint                          => trackingPoint.cfgNode
        }
    }
    .filter(_.toString != methodId)

  GraphForFuncsFunction(methodName, methodFile, methodId, astChildren, cfgChildren, pdgChildren.distinct)
xshub
@xshub
Why is the result different? Could anyone provide a new script to extract the AST/CFG/PDG and save it to a JSON file? Thanks very much.
xshub
@xshub
@rasmusli_gitlab Hi, I'm also looking for the new "graph-for-funcs.sc". Can you share it?
Niko Schmidt
@itsacoderepo
@xshub please check the docs for exporting graphs https://docs.joern.io/exporting
vedkpl
@vedkpl
Hi all,
I have the following snippet:
int 
main(int argc, char *argv[]) {
        int eaten = atoi(argv[1]);
        int value ;

        if (!strcmp(argv[1]), "drink") {
                eaten += 1;
                value = eaten * 3;
        } else {
                value = eaten;
        }   

        return value;
}
i ran the following query:
cpg.returns.l(0).reachableByFlows(cpg.call("atoi")).l