    0xpfeffer
    @0xpfeffer
    Okay
    They are negative for open as well though
    (symbol-entry open -4194304 0 0)
    Ivan Gotovchits
    @ivg
    ok, it looks like they can be negative now, starting from #794
    0xpfeffer
    @0xpfeffer
    So it's a different issue?
    Ivan Gotovchits
    @ivg
    anyway, we're investigating this issue right now. Looks like something new. We didn't notice it inhouse, as we use bap-ida integration for naming service
    0xpfeffer
    @0xpfeffer
    Right
    Ivan Gotovchits
    @ivg
    yep... this is an issue, we should be able to resolve those names, but yes it is irrelevant to primus
    0xpfeffer
    @0xpfeffer
    Well primus seems to run fine, it just doesn't trigger my rules when names are missing, right?
    Ivan Gotovchits
    @ivg
    yep
    you can, as a workaround, put sub_<whatever-name-it-got> instead of read and write. We will provide a fix soon
    0xpfeffer
    @0xpfeffer
    Well in the incidents report, it calls all those __primus_linker_unresolved_call
    I'll just look into IDA integration, they have a new free version out anyways
    Ivan Gotovchits
    @ivg
    unfortunately bap will not work with the free version of IDA as they disable scripting and batch mode.
    0xpfeffer
    @0xpfeffer
    damn :smile:
    Ivan Gotovchits
    @ivg
    can you try the following:
    bap <path-to-exe> --symbolizer=objdump --dump-symbols | grep read
    0xpfeffer
    @0xpfeffer
    Doesn't make a difference
    Do you know why it works for open but not the rest?
    Ivan Gotovchits
    @ivg
    heh, the answer to this question will provide the answer to the whole problem
    0xpfeffer
    @0xpfeffer
    :smile:
    In my other example (using the console instead of files), it only resolves fgets. Like in this example, it's the last in the list of code-entries where the start is set to 0.
    (code-entry puts 0 0)
    (code-entry __libc_start_main 0 0)
    (code-entry fgets 0 0)
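To spot the affected symbols quickly, a dump like the one above can be filtered for entries whose start is 0. This is only a minimal Python sketch: it assumes each line has the form `(code-entry <name> <start> <length>)`, which is inferred from the snippets in this chat, and `zero_start_entries` is a made-up helper name, not part of bap.

```python
import re

# Matches lines such as "(code-entry puts 0 0)". The three-field layout
# (name, start, length) is an assumption inferred from this chat.
ENTRY = re.compile(r"\(code-entry\s+(\S+)\s+(-?\d+)\s+(-?\d+)\)")

def zero_start_entries(text):
    """Return the names of code entries whose start address is 0."""
    names = []
    for match in ENTRY.finditer(text):
        name, start, _length = match.groups()
        if int(start) == 0:
            names.append(name)
    return names

dump = """
(code-entry puts 0 0)
(code-entry __libc_start_main 0 0)
(code-entry fgets 0 0)
"""
print(zero_start_entries(dump))
```

Any name this returns has no usable address in the dump, so a rule that matches on that name has nothing to attach to.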
    Ivan Gotovchits
    @ivg
    ok... we've found the bug... in fact we found it a couple of weeks ago, and believed that we fixed it the same day, though apparently we didn't push it upstream
    0xpfeffer
    @0xpfeffer
    (code-entry write 0 0)
    (code-entry close 0 0)
    (code-entry read 0 0)
    (code-entry __libc_start_main 0 0)
    (code-entry open 0 0)
    Sounds good :)
    Ivan Gotovchits
    @ivg
    it's fixed and merged
    so please upgrade your bap, it should work now
    0xpfeffer
    @0xpfeffer
    On it
    0xpfeffer
    @0xpfeffer
    Works :sparkles:
    Ivan Gotovchits
    @ivg
    cool! that bug was found the day after it was introduced, and we were pretty sure that the fix was pushed upstream. Though apparently it wasn't
    0xpfeffer
    @0xpfeffer
    I get an incident-location at every fgets, but I don't get one at the corresponding puts. My rule is
    (defmethod call (name buf)
      (when (= name 'puts)
        (must/trust-string buf)))
    But I guess that has to wait till next week. I'm done for today, thanks a lot for your help again
    Have a great week
    Ivan Gotovchits
    @ivg
    So, most likely this is because puts is a macro that maps to some ABI-specific name, most likely _IO_puts or something like this. Try using fputs instead
    0xpfeffer
    @0xpfeffer
    I used call instead of call-return in my taint rule, which may have reset the taint status. Now it works with
    (defmethod call-return (name buf len _ ret)
      (when (= name 'fgets)
        (untrust/region buf len)))
    Now, all I need is to figure out how to interpret the incident locations. Is there some documentation somewhere? My locations are reported with varying arguments, where the first seems to be the call location. The rest seems to be potential taint-source locations, right?
    Ivan Gotovchits
    @ivg
    yep, indeed, conceptually we should taint the untrusted buffer after it is filled with the untrusted data. Otherwise, it is (a) not right, and (b) the taint could indeed be overwritten by the input data.
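The ordering issue discussed here (tainting in `call` versus `call-return`) can be illustrated with a toy model. This is a sketch in Python, not the Primus taint engine: `ToyMemory`, `toy_fgets` and `run` are hypothetical names, and the rule that a fresh store drops the taint on a byte is an assumption made for the illustration.

```python
class ToyMemory:
    """Toy byte memory with a per-address taint flag (not Primus itself)."""

    def __init__(self):
        self.data = {}
        self.tainted = set()

    def store(self, addr, value):
        # Assumption of this toy model: a fresh store replaces the byte
        # and drops any taint previously attached to that address.
        self.data[addr] = value
        self.tainted.discard(addr)

    def taint_region(self, addr, length):
        self.tainted.update(range(addr, addr + length))

def toy_fgets(mem, buf, payload):
    """Stand-in for fgets: copies the payload bytes into the buffer."""
    for offset, byte in enumerate(payload):
        mem.store(buf + offset, byte)

def run(taint_on_return):
    """Return how many buffer bytes are still tainted after the call."""
    mem, buf, payload = ToyMemory(), 0x1000, b"input"
    if not taint_on_return:
        mem.taint_region(buf, len(payload))  # like tainting in `call`
    toy_fgets(mem, buf, payload)
    if taint_on_return:
        mem.taint_region(buf, len(payload))  # like tainting in `call-return`
    return len(mem.tainted)

print(run(taint_on_return=False))
print(run(taint_on_return=True))
```

In this model, tainting before the call leaves no tainted bytes once the buffer is filled, while tainting on return keeps all of them.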
    Ivan Gotovchits
    @ivg

    Incident reports are n-tuples, where the first element is the symbolic name of the incident, and the rest are locations of the points of interest. The meaning of these points is specific to the concrete incident kind. We have yet to add a declaration procedure that will let us attach at least a symbolic annotation to each point.
    In your case both incidents should have two points, if your code is the same as mine:

    (incident-report 'unchecked-untrusted-argument
        (incident-location)
        (dict-get 'taint-sources/untrusted t))
    
    (incident-report 'untrusted-argument
       (incident-location)
       (dict-get 'taint-sources/untrusted t))

    the first is the location of the place where the rule was violated, i.e., the sensitive sink. The second is the location where a taint was introduced (i.e., the untrusted input)
    Each location is denoted by a location identifier, with which we can associate arbitrary attributes. So far, there is only the incident-location attribute, which is a backtrace of the location, with the last point always being the exact point of the location.
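The n-tuple layout described above could be unpacked along these lines. This is a hedged Python sketch: the textual form `(incident <name> <location-id> ...)` is an assumption for illustration, not a documented bap format, and `Incident` and `parse_incident` are hypothetical names. The split into one sink followed by sources only applies to the two taint incidents discussed here.

```python
from typing import NamedTuple, Tuple

class Incident(NamedTuple):
    name: str                 # symbolic name of the incident
    sink: str                 # location id where the rule was violated
    sources: Tuple[str, ...]  # location ids where taint was introduced

def parse_incident(line):
    """Parse one flat '(incident name loc ...)' line (assumed format)."""
    text = line.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("not an s-expression: %r" % line)
    tokens = text[1:-1].split()
    if len(tokens) < 3 or tokens[0] != "incident":
        raise ValueError("not an incident report: %r" % line)
    return Incident(tokens[1], tokens[2], tuple(tokens[3:]))

report = parse_incident("(incident untrusted-argument 12 7)")
print(report.name, report.sink, report.sources)
```

Each location id would then be looked up separately to obtain its incident-location backtrace.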

    The documentation is in progress. This is a very new feature, still in its beta testing (besides, thanks for being the beta tester ;) )
    Also, I'm currently working on the GUI for the incidents. I have Python code that will work with IDA and load incidents, as well as allow trace replaying, jumping to a location, and all that stuff
    SaraAdamTh
    @SaraAdamTh
    +4
    ssonnagi
    @ssonnagi
    I tried the taint recipe from https://mirrors.aegis.cylab.cmu.edu/bap/recipes/taint.recipe
    I tried running it with and without symbols, but I miss a lot of info without symbols. With symbols it may have the function name, but without symbols it should at least have the address.
    But I am more interested in executables without symbols. How do I write a policy for them?
    Ivan Gotovchits
    @ivg

    Well, a function whose name wasn't recognized will get a bogus name sub_XXXX where XXXX is its address. So you can write a policy that will depend on that name.
    Not sure that this is what you're looking for, as policies usually rely on an API, to be general and reusable across binaries. So the bottom line is that the function names should be recognized. In bap the component responsible for symbol recognition is called a symbolizer. We have a few symbolizers, including the built-in one, one that uses objdump, and another that uses IDA. You can use all three (for the last one you need IDA Pro, though), or you can also write your own.

    But before going too deep, you need to verify that you're getting the maximum from the existing symbolizers. First of all, check the specification of the --symbolizer option in the bap --help output. If it doesn't exist at all, then you don't have any options to choose from, so you need to install the objdump symbolizer and/or the IDA symbolizer (if you have IDA Pro). The corresponding opam packages are bap-objdump and bap-ida
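For the `sub_XXXX` naming mentioned above, a small helper can convert between an address and the placeholder name when writing address-based policies. This is a Python sketch under an assumption: the chat only says that XXXX is the address, so rendering it as lowercase hexadecimal without padding is a guess, and `bogus_name` and `address_of` are hypothetical helpers.

```python
def bogus_name(addr):
    """Build the placeholder name for a function at the given address.

    Assumes lowercase, unpadded hexadecimal; check this against the
    names your bap version actually prints.
    """
    return "sub_%x" % addr

def address_of(name):
    """Recover the address from a placeholder name, or None if the
    name does not follow the assumed sub_<hex> pattern."""
    prefix = "sub_"
    if not name.startswith(prefix):
        return None
    try:
        return int(name[len(prefix):], 16)
    except ValueError:
        return None

print(bogus_name(0x4005D6))
print(address_of("sub_4005d6"))
print(address_of("read"))
```

A policy written against such names is tied to one binary, which is why recognized API names are preferable where a symbolizer can provide them.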

    kriw
    @kriw

    @ivg Hi, I am reading bap-plugins/deadcode/deadcode.ml. (https://github.com/BinaryAnalysisPlatform/bap-plugins/blob/master/deadcode/deadcode.ml )
    I am wondering why no_side_effects is defined as

        let open Target.CPU in
        Var.is_virtual var || is_flag var

    I thought none of the variables in arg_t, def_t, phi_t have side effects.
    Could you explain why physical registers are considered to have side effects?

    Ivan Gotovchits
    @ivg
    The deadcode elimination pass is intraprocedural, so we don't want to eliminate variables that could be used by other functions. So we rule out normal physical registers as an over-approximation. We have two more cases, virtual variables and flags.
    Since virtual variables do not represent any physical state in a CPU and are artifacts created by lifters (basically, for clarity), we can safely assume that any side effect created by assignment to such variable will not be seen in any other instructions, so we can safely eliminate them.
    The second clause, is_flag, is an underapproximation: we assume that flags set in one subroutine are never used in another. They could be, of course, but not by a piece of code that was generated by a compiler. Since compilers do not treat flags as physical locations or data, they will never do optimizations on them.
    Note, that this is an example plugin, not really used in bap.
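The policy explained above can be sketched outside of bap. This is a simplified Python model, not the plugin's OCaml code: `Var`, `no_side_effects` and `eliminate_dead` are illustrative names, and definitions are modelled as plain pairs of a defined variable and the set of names it reads, with no ordering or SSA form.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str
    is_virtual: bool = False  # lifter temporary, no CPU state behind it
    is_flag: bool = False     # status flag such as ZF

def no_side_effects(var):
    # Mirrors the predicate quoted in the question: only virtual
    # variables and flags are treated as safe to drop.
    return var.is_virtual or var.is_flag

def eliminate_dead(defs):
    """defs: list of (defined Var, set of variable names it reads).

    Repeatedly drops definitions that nothing in this function reads,
    but only when the defined variable has no side effects.
    """
    live = list(defs)
    changed = True
    while changed:
        used = set()
        for _, reads in live:
            used |= reads
        kept = [(v, r) for v, r in live
                if v.name in used or not no_side_effects(v)]
        changed = len(kept) != len(live)
        live = kept
    return live

defs = [
    (Var("t1", is_virtual=True), {"RAX"}),
    (Var("t2", is_virtual=True), {"RBX"}),
    (Var("ZF", is_flag=True), {"t2"}),
    (Var("RAX"), {"RBX"}),
]
print([v.name for v, _ in eliminate_dead(defs)])
```

Here `RAX` survives even though nothing in the function reads it after the temporaries are gone, because another function might.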
    kriw
    @kriw

    The deadcode elimination pass is intraprocedural, so we don't want to eliminate variables that could be used by other functions.

    I see, thank you.

    🌹
    @liquid_pascal_twitter
    I'm trying to learn how to use the Primus Lisp system. I followed this tutorial you guys wrote here (BinaryAnalysisPlatform/bap#812), but I'm not seeing any results. Could anyone point me to more information?
    Ivan Gotovchits
    @ivg
    Sure, I would suggest using the primus-checks recipe from the bap-recipes repository as the starting example; instructions are provided in the link. If you want to learn how to program new analyses in Primus using either OCaml or Primus Lisp, then you can learn the interfaces using our documentation (here is the Lisp documentation). See also bap --primus-lisp-help.
    Daniel Peters
    @dtpeters
    just out of curiosity, how stable are the interfaces for primus? As in, how much will these change before 2.0 and will it change at all with the 2.0 update
    Ivan Gotovchits
    @ivg
    Primus 1.0 will be fully compatible with BAP 2.x, so after it is released all code written for the current version of Primus will work without any changes.
    Of course, to get all benefits of BAP 2.x, one should consider switching to Primus 2.0.
    To highlight, Primus 2.0 will be more flexible and generic. It won't be a monad transformer anymore, just a monad. It won't depend on CFG or IR or BIL. Also, Primus Lisp will now become BAP Lisp, with a more general application area, for example, it would be now possible to write lifters and static analysis using BAP Lisp. The Lisp language itself will have some changes, like namespaces, for example.
    Daniel Peters
    @dtpeters
    awesome! thanks for the reply, as always @ivg