Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Repo info
Activity
  • Nov 26 02:04

    github-actions[bot] on master

    1077911 (compare)

  • Nov 26 01:12

    github-actions[bot] on v2.6.0-alpha

    (compare)

  • Nov 22 05:05
    andrewj-brown synchronize #1546
  • Nov 19 02:21

    github-actions[bot] on master

    1077911 (compare)

  • Nov 19 01:20

    github-actions[bot] on v2.6.0-alpha

    (compare)

  • Nov 14 15:16
    kit-ty-kate opened #1561
  • Nov 12 02:30

    github-actions[bot] on master

    1077911 (compare)

  • Nov 12 01:04

    github-actions[bot] on v2.6.0-alpha

    Corrected primus_dictionary_mai… Fix not propagating term attrs … (compare)

  • Nov 10 21:03
    bmourad01 synchronize #1559
  • Nov 10 20:14
    bmourad01 opened #1559
  • Nov 10 19:58

    ivg on master

    Fix not propagating term attrs … (compare)

  • Nov 10 19:58
    ivg closed #1556
  • Nov 08 18:37
    ivg closed #1558
  • Nov 08 18:36

    ivg on master

    Corrected primus_dictionary_mai… (compare)

  • Nov 05 02:25

    github-actions[bot] on master

    77d09fe (compare)

  • Oct 31 17:22
    A-Benlolo opened #1558
  • Oct 29 02:37

    github-actions[bot] on master

    77d09fe (compare)

  • Oct 29 01:04

    github-actions[bot] on v2.6.0-alpha

    (compare)

  • Oct 27 14:51
    bmourad01 synchronize #1556
  • Oct 22 02:41

    github-actions[bot] on master

    77d09fe (compare)

mpmfrans
@mpmfrans
thank you!
does this mean that determining a cutoff and timeout should be done for each binary?
because in this case we did not now at initially to increase.. so there is no cutoff and timeout that fits all binaries right?
Ivan Gotovchits
@ivg
In fact, I can't confirm this anymore, so it looks like that I was just lucky. Unfortunately, z3 is not really deterministic
mpmfrans
@mpmfrans
I have to read about Z3 and what it is :)
Ivan Gotovchits
@ivg

because in this case we did not now at initially to increase.. so there is no cutoff and timeout that fits all binaries right?

The bigger the better. The more cutoff the more time you will spend on the same branch (but you will progress forward slower), the more timeout the more complex formulas you will be able to solve, but you will again progress slower. So if you have an infinite amount of time you can put both to infinity.

mpmfrans
@mpmfrans
@ivg thank you so much for all your help!
Ivan Gotovchits
@ivg
I would suggest experimenting with your test suite. For timeouts, to my experience, 16 ms is probably not enough. But anything greater than 10 seconds is overkill. I would stick somewhere between one second and ten seconds. Concerning cutoff, it should be roughly proportional to the size of the input problem. If you're hashing a file with 1k bytes, it should be 1k times N, where N is some overhead coefficient due to compiler inefficiencies. Take it equal to 10 at least, so for 1k input set it to 10k
you're welcome)
mpmfrans
@mpmfrans
It the following is in the output, does this mean the symbolic executor is freaking out?
(executor-debug "46: Forked a new machine")
(executor-debug "46: SAT")
(executor-debug "46: &-desc = #xffffffffffffffff")
(executor-debug "46: *-desc = #xffffffffffffffff")
(executor-debug "46: \\\b-desc = #xffffffffffffffff")
(executor-debug  "46: \\\
                \n-desc = #xffffffffffffffff")
(executor-debug "46: \\\014-desc = #xffffffffffffffff")
(executor-debug "46: \\ -desc = #xffffffffffffffff")
(executor-debug "46: \\\127ELF\\\002\\\001\\\001-desc = #x0000000000000003")
(executor-debug "46: \\\127ELF\\\002\\\001\\\001-size = #x0000000000000000")
(executor-debug "46: \\\135-desc = #xffffffffffffffff")
(executor-debug "46: \\\136-desc = #xffffffffffffffff")
(executor-debug "46: \\\137-desc = #xffffffffffffffff")
(executor-debug "46: \\\148-desc = #xffffffffffffffff")
(executor-debug "46: \\\149-desc = #xffffffffffffffff")
(executor-debug "46: \\\152-desc = #xffffffffffffffff")
(executor-debug "46: \\\161-desc = #xffffffffffffffff")
(executor-debug "46: \\\167-desc = #xffffffffffffffff")
(executor-debug "46: \\\173-desc = #xffffffffffffffff")
(executor-debug "46: \\\177-desc = #xffffffffffffffff")
(executor-debug "46: \\\184-desc = #xffffffffffffffff")
(executor-debug "46: \\\185-desc = #xffffffffffffffff")
(executor-debug "46: \\\188-desc = #xffffffffffffffff")
(executor-debug "46: \\\197-desc = #xffffffffffffffff")
(executor-debug "46: \\\226-desc = #xffffffffffffffff")
(executor-debug "46: \\\231-desc = #xffffffffffffffff")
(executor-debug "46: \\\236-desc = #xffffffffffffffff")
(executor-debug "46: \\\237-desc = #xffffffffffffffff")
(executor-debug "46: \\\239-desc = #xffffffffffffffff")
(executor-debug "46: \\\240-desc = #xffffffffffffffff")
(executor-debug "46: \\\243-desc = #xffffffffffffffff")
(executor-debug "46: ]-desc = #xffffffffffffffff")
(executor-debug "46: _-desc = #xffffffffffffffff")
(executor-debug "46: _8-desc = #xffffffffffffffff")
(executor-debug "46: A-desc = #xffffffffffffffff")
(executor-debug "46: E-desc = #xffffffffffffffff")
(executor-debug "46: f-desc = #xffffffffffffffff")
(executor-debug "46: M-desc = #xffffffffffffffff")
(executor-debug "46: S-desc = #xffffffffffffffff")
(executor-debug "46: t-desc = #xffffffffffffffff")
(executor-debug "46: u-desc = #xffffffffffffffff")
(executor-debug "46: Z-desc = #xffffffffffffffff")
(executor-debug "46: mem[0x40C9] = #x01")
(executor-debug "46: mem[0x40CA] = #x01")
(executor-debug "46: mem[0x40CB] = #x01")
(executor-debug "46: mem[0x40CC] = #x01")
(executor-debug "46: mem[0x40CD] = #x01")
(executor-debug "46: mem[0x40CE] = #x01")
mpmfrans
@mpmfrans
I did change the code quite a bit. Instead of a hardcoded path (but SE doesnt do anything with it) I provide the input via argv. I also initialize the buffer a bit different and I check if the buffer is not NULL so malloc succeeded. Also, yesterday, the length of the for-loop for the hash was hardcoded. now It is variable. So probably I have to increase the cutoff and timeout again because the SE has to solve the length variable as well..

Yesterday it was:

for (i = 0; i < 40 && *str; ++str, ++i){
                gen_hash = (*str) + (gen_hash << 6) + (gen_hash << 16) - gen_hash;
        }

Now it is:

for (i = 0; i < len && *str; ++str, ++i){
                gen_hash = (*str) + (gen_hash << 6) + (gen_hash << 16) - gen_hash;
        }

The len variable is determined when a file is opened based upon what contents that file has

mpmfrans
@mpmfrans
jeeeeej, it worked :)
Ivan Gotovchits
@ivg
No, it is just generating random names for files)
Kenneth Adam Miller
@KennethAdamMiller
I hope that you aren't annoyed by me, but I have another question. I was reading up on the Theory library again, and I was wondering how that all fits in. I don't see a lisp theory plugin anywhere, and I wondered what use cases Theory was intended to support. I suppose I thought it might have facilitated some kind of dynamically acquired language. Like, say let the language form as a part of some computational engineering or machine learning. I am not sure that that is the right use case though. What do you have in store for Theory?
(this is just curiosity)
Ivan Gotovchits
@ivg
I am not, I am just not working at CMU anymore, so I can spend as much time on BAP as when it was my only job. The Theory library is the tagless-final program representation. I am surprised with your question, as we discussed this many times and you even read the draft paper and saw my presentation, where it is described. Maybe I totally misunderstood your question, though.
Kenneth Adam Miller
@KennethAdamMiller

Possibly I am rusty after some time of having not read those papers. I didn't know you weren't at CMU anymore.

I am familiar with your paper and tagless final style and the Theory documentation. I suppose I haven't exercised Theory as much as I have a lot of the rest of the BAP API. So, although I see it, I don't see it in use. Let's say you were familiar with a particular car part. Well, in that case, you could know what could be done with it from what it's functionality is. Theory seems really general with many use cases, so it's not quite like a specific car part. I just wasn't sure what your exact use case was.

Kenneth Adam Miller
@KennethAdamMiller
I think I get what it does, but perhaps my foresight is fuzzy. Seems like you could facilitate language meaning support. I suppose I think that because part of BAP's fundamental objectives is to support decoding to a programming language, so therefore decompilation.
mpmfrans
@mpmfrans
@ivg Ok so I did modify the harness. For the fuzzer I had to do the input for stdin otherwise it would freak out. Somehow, when I run the symbolic executor it outputs (what you mentioned earlier) that there is only one fork and the message stating that there is no more work to do. The hash function etc. is exactly the same, I only changed the way input files are handled. I am wondering what could be the problem now.. I tried increasing both the cutoff and timeout.
is it because its reading from stdin?
with getline(buf,len,stdin)
the variables from IO are turned into symbolic values right?
mpmfrans
@mpmfrans
Ok so yes it is due to the getline(buf,len,stdin). If i comment this line the symbolic executor is running.
But I need this piece of code in order for the program to work properly.
Maybe a weird remark.. because the symbolic executor is printing debug information.. is this read by the binary via IO?
mpmfrans
@mpmfrans
and therefore quits symbolically executing after only one fork?
In that case I should tell the symbolic executor not to do IO
Ok removing symbolic-stdio from --primus-lisp-load doesnt have any effect.
mpmfrans
@mpmfrans
Yes, everything is working. The SE gives me multiple solutions. The solutions do make the program branch the if-statement and output "access granted". The only question left is why the SE is not functioning properly (starting, fork one and quit) when I add the line getline(buf,len,stdin) to the program.
Matt Griffin
@matt.griffin_gitlab

Hi, I've been using BAP to analyse Windows binaries as follows:

bap ./binary.dll -dbir --optimization-level=3

After upgrading to BAP 2.5.0 yesterday (late to the game I know) I get the following error:

Failed to build the project: attribute "llvm:symbol-entry" didn't provide a
value for the field "value"

This is definitely new :(

For reference this is the output of file binary.dll

binary.dll: PE32+ executable (DLL) (GUI) x86-64, for MS Windows
Benjamin Mourad
@bmourad01
@mpmfrans For getline maybe there is no API and stub for it: https://github.com/BinaryAnalysisPlatform/bap/blob/master/plugins/api/api/c/posix.h
You can find these headers installed in ~/.opam/<my-switch>/share/bap/api/c/
Additionally you would want to stub out the functionality of getline using Primus Lisp
Try with the option --primus-print-observations=linker-unresolved and see if it does that when it tries to call getline
mpmfrans
@mpmfrans
Indeed there is no implementation for that function. For getchar() there is though.
Ivan Gotovchits
@ivg

Yes, everything is working. The SE gives me multiple solutions. The solutions do make the program branch the if-statement and output "access granted". The only question left is why the SE is not functioning properly (starting, fork one and quit) when I add the line getline(buf,len,stdin) to the program.

Probably because it covered everything. If you didn't have a branch then what do you expect? SE will not just reran randomly if there is no control-flow dependency on the symbolic input. Also, don't forget that SE will search not only for the bug that you want it to search, but for any other bug in your code. So make sure that you check the return codes of all functions.

And yes, indeed, getline is not supported
mpmfrans
@mpmfrans
Thanks again @ivg I implemented fgetc(stdin) and it is working properly. I am so excited about this field or research!
Matt Griffin
@matt.griffin_gitlab

Hi, I've been using BAP to analyse Windows binaries as follows:

bap ./binary.dll -dbir --optimization-level=3

After upgrading to BAP 2.5.0 yesterday (late to the game I know) I get the following error:

Failed to build the project: attribute "llvm:symbol-entry" didn't provide a
value for the field "value"

This is definitely new :(

For reference this is the output of file binary.dll

binary.dll: PE32+ executable (DLL) (GUI) x86-64, for MS Windows

I'm not sure how I managed to do this in the past as it fails with the same error in all the previous versions (2.2., 2.3., 2.4.*). I guess it has something to do with the llvm version?

Ivan Gotovchits
@ivg

@matt.griffin_gitlab, can you clean the caches (bap --cache-clean) and try again. Most likely it won't help, but just to be sure. If it will not help, then please create an issue and don't forget to inlude bap --llvm-version output. It would be nice, if you will attach the binary for reproduction. If it is not possible, then attach the output of bap specification binary.dll (you may redact sensitive data, if you need).

As a workaround, you can experiment with llvm versions. As a hacky workaround, you can remove llvm:symbol-entries, e.g.,

bap specification binary.dll | grep -v llvm:symbol-entry > binary.ogre
bap ./binary.dll --loader=./binary.ogre

though it may result in an incomplete disassembly. A more sophisticated workaround with some sed magic, is to add a dummy 0 value for each llvm:symbol-entry.

Matt Griffin
@matt.griffin_gitlab
Sure, I can do the above. FYI I ran the command using the BAP docker image in case the issue was with my machine and it succeeded!
Ivan Gotovchits
@ivg
ha, so it should be definitely something wrong with llvm
can you compare bap --llvm-version on your machine and in the docker container?
2 replies
Benjamin Mourad
@bmourad01
@ivg I think I observed the real bottleneck when loading large programs in Primus: https://github.com/BinaryAnalysisPlatform/bap/blob/master/plugins/primus_loader/primus_loader_basic.ml#L206
Though I'm not sure if we can do much to optimize it
Ivan Gotovchits
@ivg
@bmourad01, I have an idea how we can make it faster. We should turn into names only tids that correspond to blocks and subroutines. Because only those could be called via linker (individual defs and jumps are not allowed, so we shouldn't link them, or add them to the environment)
Benjamin Mourad
@bmourad01
What is the intended purpose of that init_names pass? I think the linking of the names is done elsewhere, but here it seems that the names are being made available in the Primus environment
Ivan Gotovchits
@ivg
So that you can do @malloc or %deadbeef and corresponding tids will be evaluated to their addresses and you can do transfer control flow to those addresses
mpmfrans
@mpmfrans
What is the default cutoff? For timeout its 16 ms.