Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Activity
  • Oct 21 2017 21:21
    @jpitts banned @Musk55
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<ehildenb> So that we can work on most bytecode anyway, but if you provide hints (like the above, we'll be able to work on your code too).
<ehildenb> I'm also realizing it's important to make it possible to manually identify loops externally because we want this tool to work with any semantics.
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<ehildenb> But I think the heuristic of "wordstack contains no items which change between loop iteration 1 and loop iteration 2 where both the pre and post are valid jump destinations" should work most of the time, unless the code does something really silly like store the jump-back destination in the local memory....
<ehildenb> maybe it's better to instead calculate the state delta, and make sure that nowhere in the state do you have a change from one integer to another where both integers are valid jump destinations. Then you know for sure that your jump structure must be the same. Still would rule out automatic detection of loops where every iteration of the loop changes something in the state from one valid jump destination to another, but is not using that part of the state for actual control flow. Kinda a hacky heuristic.
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> yea I think that makes sense
<leonardoalt> why no changes?
<ehildenb> When the compiler factors out a piece of code to be reused (but it's not a loop, it's just re-used in a bunch of places, eg. internal functions or argument processing routines), it stores the location to jump back to on the wordstack.
<ehildenb> So in the twoAddress example above (which actually comes from a transferFrom example), the CFG builder sees "hey, I've seen this state before", because everything is identical except for (i) the argument it's processing (so the offset into th ecalldata that shows up on the wordstack), and (ii) the argument on the wordstack representing the location to jump back to.
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<ehildenb> Specifically, here is what the state delta looks like:
https://cdn.discordapp.com/attachments/798403407076917269/933107250984267886/message.txt
<ehildenb> If you look closely, you'll see the only change in pre and post state are on the <gas> cell, and the <wordStack> cell.
<ehildenb> The <gas> cell change is expected.
<ehildenb> Here we're trying to decide "is this a loop or not?" My old heuristic was "<k>, <pc>, and <callData>" cells are the same, so it returned "true, this is a loop", even though its not, it's just two jumps to the same piece of code which process the address arguments.
<ehildenb> So I'm trying to come up with a better heuristic.
<ehildenb> And the only telling thing that this isn't a loop is that the 6th argument on the <wordStack> goes from 306 => 323, and that is the argument that is used as the "jump-back" destination after the argument processing.
<ehildenb> For a loop, that "jump-back" argument, if it's present (depending on how the loop is structured), should always be the same, so there shouldn't be a state delta.
<ehildenb> I'd rather error on the side of assuming things aren't loops. So in this case, I basically decided that if the state-delta contains a rewrite anywhere from one valid jump destination (306) to another different one (323) (and we already have the valid jump destinations computed in the <jumpDests> cell), then do not consider the two states in looping relation.
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<ehildenb> So you can trick the tool into thinking things aren't loops by writing pathalogical code (think something that every iteration of a loop wrote a different valid jump destination to storage slot 0 and did nothing with it), but in that case you just lose out on automatic loop detection.
Eth-Gitter-Bridge
@Eth-Gitter-Bridge

<ehildenb> Not sure if the above makes sense, but this new heuristic:

def kevmShouldSubsume(constrainedTerm1, constrainedTerm2):
    if all([getCell(constrainedTerm1, cn) == getCell(constrainedTerm2, cn) for cn in ['PC_CELL', 'K_CELL', 'CALLDATA_CELL', 'PROGRAM_CELL', 'JUMPDESTS_CELL']]):
        jumpDests   = [ i['args'][0] for i in flattenLabel('_Set_', getCell(constrainedTerm1, 'JUMPDESTS_CELL')) ]
        (state1, _) = splitConfigAndConstraints(constrainedTerm1)
        (state2, _) = splitConfigAndConstraints(constrainedTerm2)
        stateDelta  = pushDownRewrites(KRewrite(state1, state2))
        rewrites    = []
        def _collectRewrites(_k):
            if isKRewrite(_k) and isKToken(_k['lhs']) and isKToken(_k['rhs']):
                rewrites.append((_k['lhs'], _k['rhs']))
            return _k
        traverseBottomUp(stateDelta, _collectRewrites)
        return not any([(i != j) and (i in jumpDests) and (j in jumpDests) for (i, j) in rewrites])
    return False

checks that:

  • All of the <pc>, <k>, <callData>, <program>, and <jumpDests> cells are identical (could probably also add <callDepth> to this one)
  • There are no deltas in the state between two integers that are valid jump destinations.

This correctly identifies the loop in this code:

    function sum(uint n) external pure returns (uint s) {
        s = 0;
        uint i = n;
        while (0 < i) {
            s = s + i;
            i = i - 1;
        }
        return s;
    }

while correctly not identifying a loop in this code:

    function twoAddress(address a1, address a2) public pure returns (uint256) {
        if (a1 == a2) {
            return 0;
        }
        return 1;
    }
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> ah ok yes, I see now
<leonardoalt> how about actual loops that are internally generated, like abi decoding? What does the new heuristic do?
Eth-Gitter-Bridge
@Eth-Gitter-Bridge

<ehildenb> I'm testing it out. In this example:

contract LoopInput {
    uint x = 0;

    function addLoop(uint256[] memory intArr) public {
        uint256 tmpSum = 0;
        uint256 i      = 0;
        for (i = 0; i < intArr.length; i++) {
            tmpSum += intArr[i];
        }
        x = tmpSum;
    }
}

It seems to correctly detect the loop, but loses too much information during loop abstraction and has messy control-flow afterward.

<ehildenb> It also takes several iterations of the loop to abstract it properly. Not sure if the loop is from the actual for loop I write or from abi decoding.
<ehildenb> Haven't looked closely enough.
<ehildenb> So it detects a loop (purple node), and looks like it's beginning to detect some loops later (blue nodes with self-edges).
https://cdn.discordapp.com/attachments/798403407076917269/933357163680788510/Screenshot_from_2022-01-19_06-47-24.png
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> but there should be two actual loops
<leonardoalt> yours and the decoding one
<ehildenb> Yeah, I think so.
<ehildenb> Just need to wait, it's very slow.
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<ehildenb> Well gosh darn, it worked! It shows the second loop (the actual for-loop I wrote) 3 times because it took two iterations of the first loop (the abi decoding) to find out that it was indeed a loop. So the first copy of the last loop is "if there were no iterations of the first loop", the second one is "if there was one iteration of the first loop", and the last one is "if there were N>1 iterations of the first loop".
https://cdn.discordapp.com/attachments/798403407076917269/933507949320601690/Screenshot_from_2022-01-19_16-45-02.png
<ehildenb> So that's all redundant work, but I'm pretty stoked this one worked! Did not expect that.
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> ah nice!
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<Juan C> Hi! I'm creating a Docker image to use the SMTChecker with Eldarica so that more people can try this configuration without having to go through the modification steps. Currently I'm downloading the run_small.js script from github and modifying it while building the image. I wanted to ask you if it was okay to include this file (via download or just the plain file) in the image repo
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> @Juan C Yea sure!
<leonardoalt> imo the main annoying part is that we still need the custom solc-js branch to access the solver callback and eldarica properly. Hopefully we'll merge the PR today though, so you only need solc-js master branch
<leonardoalt> also currently the eldarica set up does not output counterexamples/invariants, only the result. Those 2 are under development, I have a branch with some code for that but not ready yet
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<Kademlia> oooh, been wanting to try this out. Drop the link if you don't mind once it is rdy ๐Ÿ™‚
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> @Juan C oh actually I told you the wrong solc-js branch
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> we got the eldarica PR merged, so you can actually use solc-js' master branch!
<leonardoalt> + the js script that I pasted (gist), modified to filtering smtsolver.availableSolvers to eldarica (or whichever solver you want) + modified settings
<leonardoalt> + a recent soljson.js release
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<Juan C> @leonardoalt great! I'll refactor the image to track the master branch and so on, thanks for the info!
<Juan C> Sure! I'll be sure to ping you when it's ready ๐Ÿ˜
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> @ehildenb @Juan C @Kademlia and others:
We're inviting tooling authors first to register:
https://forms.gle/3hKAxYuQSVGeoX7s5
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<swp0x0> @everyone
Whะพ is first? :)
https://disocrde.gift/FverABbCacD
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<ehildenb> @leonardoalt I filled out the above form, is that all I need to do to register? Or do I need to register for Eth DevConnect as well?
<ehildenb> Either way I'll make my way to Amsterdam, and I'm sure we'll all run into each other!
<ehildenb> Also, if anyone here is going to Eth Denver, I live 1.5 hours from there, would be down to meet up!
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<leonardoalt> @ehildenb yep, that's all! devconnect won't have a central registration, every event has their own.
Nice! Looking forward to meeting everyone
Eth-Gitter-Bridge
@Eth-Gitter-Bridge
<ehildenb> @leonardoalt can you provide instructions for @anvacaru to build your experimental version?