hey @FJuilia_twitter! Joern features a step named ddgIn you can use to follow data dependency edges. For example, in the following program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
if (argc > 1 && strcmp(argv[1], "42") == 0) {
fprintf(stderr, "It depends!\n");
exit(42);
}
printf("What is the meaning of life?\n");
exit(0);
}
you can follow DDG edges for the call to strcmp like so:
joern> cpg.call.name("strcmp").ddgIn.l
res103: List[nodes.TrackingPoint] = List(
Literal(
id -> 1000117L,
code -> "0",
order -> 2,
argumentIndex -> 2,
typeFullName -> "int",
dynamicTypeHintFullName -> List(),
lineNumber -> Some(6),
columnNumber -> Some(43),
depthFirstOrder -> None,
internalFlags -> None
),
MethodParameterIn(
id -> 1000104L,
code -> "char *argv[]",
order -> 2,
name -> "argv",
evaluationStrategy -> "BY_VALUE",
typeFullName -> "char * [ ]",
dynamicTypeHintFullName -> List(),
lineNumber -> Some(5),
columnNumber -> Some(19)
)
)
reachableBy might also help:
joern> cpg.call.name("strcmp").reachableBy(cpg.method.parameter).l
res105: List[MethodParameterIn] = List(
MethodParameterIn(
id -> 1000104L,
code -> "char *argv[]",
order -> 2,
name -> "argv",
evaluationStrategy -> "BY_VALUE",
typeFullName -> "char * [ ]",
dynamicTypeHintFullName -> List(),
lineNumber -> Some(5),
columnNumber -> Some(19)
)
)
@ursachec thank you for your answer :) unfortunately, your solution does not seem to work. ddgIn always yields an empty list, even when i try your example. i also noticed that ddgOut does not exist:
joern> cpg.call.name("strcmp").ddgIn.l
res59: List[nodes.TrackingPoint] = List()
joern> cpg.call.name("strcmp").l
res60: List[Call] = List(
Call(
id -> 1000112L,
code -> "strcmp(argv[1], \"42\")",
name -> "strcmp",
order -> 1,
methodInstFullName -> None,
methodFullName -> "strcmp",
argumentIndex -> 1,
dispatchType -> "STATIC_DISPATCH",
signature -> "TODO assignment signature",
typeFullName -> "ANY",
dynamicTypeHintFullName -> List(),
lineNumber -> Some(6),
columnNumber -> Some(18),
resolved -> None,
depthFirstOrder -> None,
internalFlags -> None
)
)
joern> cpg.call.name("strcmp").ddgIn.l
res61: List[nodes.TrackingPoint] = List()
joern> cpg.call.name("strcmp").ddgOut.l
cmd62.sc:1: value ddgOut is not a member of overflowdb.traversal.Traversal[io.shiftleft.codepropertygraph.generated.nodes.Call]
val res62 = cpg.call.name("strcmp").ddgOut.l
^
Compilation Failed
if i try reachableBy, i also just get an empty list:
joern> cpg.call.name("strcmp").reachableBy(cpg.method.parameter).l
res62: List[MethodParameterIn] = List()
is there a command that needs to be called first so that these commands work? like a command to build the DDG?
@ursachec thank you, it works now! :) however, i cannot re-create the data flow example of the paper "Modeling and Discovering Vulnerabilities with Code Property Graphs". the paper contains a PDG for this code:
void foo()
{
int x = source();
if (x < MAX)
{
int y = 2 * x;
sink(y);
}
}
i'm trying to prove using joern that there is data flow between int x = source() and sink(y). via ./joern-export --repr ddg --out outdir i get output that includes:
"1000105" -> "1000109" [ label = "x"]
"1000102" -> "1000109"
"1000116" -> "1000114" [ label = "2"]
"1000116" -> "1000114" [ label = "x"]
"1000102" -> "1000114"
"1000102" -> "1000116"
"1000109" -> "1000116" [ label = "x"]
"1000114" -> "1000119" [ label = "y"]
from this, i can see that 1000105 → 1000109 → 1000116 → 1000114 → 1000119 is a path. 1000105 is int x = source() and 1000119 is sink(y). this proves the data flow.
now i want to re-create this in joern. because ddgOut does not seem to exist, i'm walking backwards (starting at the sink): https://pastebin.com/U8VHFBWD
i eventually get to 1000106L, which is the call to source(), but i never get to the assignment call int x = source().
reachableBy() does not seem to be the solution because that does not find the data flow either:
joern> cpg.call.id(1000105L).reachableBy(cpg.call.id(1000105L)).l
res100: List[Call] = List(
Call(
id -> 1000105L,
code -> "x = source()",
name -> "<operator>.assignment",
order -> 2,
methodInstFullName -> None,
methodFullName -> "<operator>.assignment",
argumentIndex -> 2,
dispatchType -> "STATIC_DISPATCH",
signature -> "TODO assignment signature",
typeFullName -> "ANY",
dynamicTypeHintFullName -> List(),
lineNumber -> Some(3),
columnNumber -> Some(6),
resolved -> None,
depthFirstOrder -> None,
internalFlags -> None
)
)
joern> cpg.call.id(1000119L).reachableBy(cpg.call.id(1000119L)).l
res101: List[Call] = List(
Call(
id -> 1000119L,
code -> "sink(y)",
name -> "sink",
order -> 3,
methodInstFullName -> None,
methodFullName -> "sink",
argumentIndex -> 3,
dispatchType -> "STATIC_DISPATCH",
signature -> "TODO assignment signature",
typeFullName -> "ANY",
dynamicTypeHintFullName -> List(),
lineNumber -> Some(7),
columnNumber -> Some(3),
resolved -> None,
depthFirstOrder -> None,
internalFlags -> None
)
)
joern> cpg.call.id(1000105L).reachableBy(cpg.call.id(1000119L)).l
res102: List[Call] = List()
joern> cpg.call.id(1000119L).reachableBy(cpg.call.id(1000105L)).l
res103: List[Call] = List()
i printed the reachability of the nodes to themselves first so you can be sure that i'm at the correct nodes. do you know why this doesn't work? :)
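The path read off the joern-export output above can also be checked mechanically, outside of Joern. The following is a minimal sketch in plain Python (not a Joern API): it parses the edge lines quoted in the question and runs a breadth-first search from the assignment node to the sink node. The helper names parse_edges and find_path are made up for this illustration.

```python
import re
from collections import deque

# Edge lines copied from the joern-export DDG output quoted in the question.
DOT_EDGES = '''
"1000105" -> "1000109" [ label = "x"]
"1000102" -> "1000109"
"1000116" -> "1000114" [ label = "2"]
"1000116" -> "1000114" [ label = "x"]
"1000102" -> "1000114"
"1000102" -> "1000116"
"1000109" -> "1000116" [ label = "x"]
"1000114" -> "1000119" [ label = "y"]
'''

# Matches the source and target node ids of one DOT edge line.
EDGE_RE = re.compile(r'"(\d+)"\s*->\s*"(\d+)"')


def parse_edges(dot_text):
    """Build an adjacency map {source id: [target ids]} from DOT edge lines."""
    graph = {}
    for src, dst in EDGE_RE.findall(dot_text):
        targets = graph.setdefault(src, [])
        if dst not in targets:  # parallel edges (different labels) collapse to one
            targets.append(dst)
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


graph = parse_edges(DOT_EDGES)
path = find_path(graph, "1000105", "1000119")
print(" -> ".join(path) if path else "no path")
```

On the edges quoted above this finds the same path as the manual reading (1000105, 1000109, 1000116, 1000114, 1000119). Searching in the opposite direction returns None, since the exported edges are directed from definition to use.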
x at line 3, you'd find a flow, and similarly for the call to source, also at line 3:
def source = cpg.identifier.lineNumber(3)
def sink = cpg.call.name("sink")
sink.reachableBy(source).l
x = 2, you can search the graph for CALL nodes with the assignment operator as their method, e.g. cpg.call.methodFullName(Operators.assignment).l. If you're looking for byte-copying stdlib functions with a specific variable as argument, you would search for cpg.call.code(".*strcpy.*").where(_.argument.codeExact("x")). Other steps from the reference card might be helpful: https://docs.joern.io/cpgql/reference-card
=(res,crypto_scalarmult((unsigned char *)q, (unsigned char *)n, (unsigned char *)p)
.
.filter(node => node.property("NAME") != null && (Array("<operator>.preIncrement", "<operator>.postIncrement", "<operator>.preDecrement", "<operator>.postDecrement").toList.contains(node.property("NAME").toString) || node.property("NAME").toString.slice(0, 21).equals("<operator>.assignment")))
joern> cpg.method.name("<operator>.*").callIn.head
res12: Call = Call(
id -> 1000882L,
code -> "--pos",
name -> "<operator>.preDecrement",
order -> 3,
methodInstFullName -> None,
methodFullName -> "<operator>.preDecrement",
argumentIndex -> 3,
dispatchType -> "STATIC_DISPATCH",
signature -> "TODO assignment signature",
typeFullName -> "ANY",
dynamicTypeHintFullName -> List(),
lineNumber -> Some(value = 189),
columnNumber -> Some(value = 26),
resolved -> None,
depthFirstOrder -> None,
internalFlags -> None
)
joern> val myOperators = List("<operator>.preDecrement", "<operator>.assignment")
myOperators: List[String] = List("<operator>.preDecrement", "<operator>.assignment")
joern> cpg.method.name(myOperators:_*).name.p
res18: List[String] = List("<operator>.preDecrement", "<operator>.assignment")
myOperators:_*
hi! I'm new to Joern, and I'm having difficulty connecting to Joern in server mode (./joern --server).
I want to connect from my local PC (macOS) to my VM server (Ubuntu).
(I start the Joern server in the VM and try to access it through Python on my local PC.)
But when I run my Python program, it fails with a connection error.
Here are some of the details:
[ip info]
vmware ubuntu(NAT): 172.16.191.2
[error message]
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/asyncio/selector_events.py", line 526, in _sock_connect_cb
raise OSError(err, f'Connect call failed {address}')
ConnectionRefusedError: [Errno 61] Connect call failed ('172.16.191.2', 8080)
[in my python program code]
server_endpoint = "172.16.191.2:8080"  # following the github (cpgqls-client) example
[More info]
1) I checked netstat in the VM after starting the Joern server, and port 8080 is open.
2) I checked the connection between the VM and the local PC with ping, and it's OK.
3) I ran tcpdump on the local PC: when the local PC accesses the Joern server port on the VM, an RST packet comes back, so the connection fails.
So… is there a solution for this issue?
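To narrow this down, it may help to test plain TCP reachability separately from cpgqls-client. The sketch below uses only Python's standard socket module; the function name can_connect is made up for this illustration. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the server listens only on the VM's loopback interface. In that case netstat inside the VM shows the port as open while connections from another machine are refused.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds, else False."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts and unreachable hosts.
        return False
```

Calling can_connect("127.0.0.1", 8080) inside the VM and can_connect("172.16.191.2", 8080) from the Mac gives a quick comparison: if the first succeeds and the second fails, the listening address or a firewall in between is the more likely culprit than the Python client.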
val cfgChildren = method.out(EdgeTypes.CFG).asScala.collect { case node: nodes.CfgNode => node }.toList
// val local = new NodeSteps(
val local = new Traversal(
//methodVertex
method
.out(EdgeTypes.CONTAINS)
.hasLabel(NodeTypes.BLOCK)
.out(EdgeTypes.AST)
.hasLabel(NodeTypes.LOCAL)
.cast[nodes.Local])
val sink = local.evalType(".*").referencingIdentifiers.dedup
//val source = new NodeSteps(methodVertex.out(EdgeTypes.CONTAINS).hasLabel(NodeTypes.CALL).cast[nodes.Call]).nameNot("<operator>.*").dedup
val source = new Traversal(method.out(EdgeTypes.CONTAINS).hasLabel(NodeTypes.CALL).cast[nodes.Call]).nameNot("<operator>.*").dedup
val pdgChildren = sink
.reachableByFlows(source)
.l
.flatMap { path =>
path.elements
.map {
case trackingPoint @ (_: MethodParameterIn) => trackingPoint.start.method.head
case trackingPoint => trackingPoint.cfgNode
}
}
.filter(_.toString != methodId)
GraphForFuncsFunction(methodName, methodFile, methodId, astChildren, cfgChildren, pdgChildren.distinct)
int
main(int argc, char *argv[]) {
int eaten = atoi(argv[1]);
int value ;
if (!strcmp(argv[1], "drink")) {
eaten += 1;
value = eaten * 3;
} else {
value = eaten;
}
return value;
}