Why can't solutions generated by angr be reproduced?
Dear developers,
I asked this question in Slack but have not received a response yet, so I am copying most of it here. I look forward to your suggestions or ideas. Thank you so much in advance!
The question is: why can't the solutions generated by angr be reproduced? Please see the following example for details. The binary that fails to reproduce, b-gcc-O1, is compiled with gcc-11 at the -O1 optimization level and takes a string as input (the source program reads the argument from argv[1]).
My goal is to use angr to feed a symbolic input to the target binary (e.g., b-gcc-O1) and let it automatically find all possible execution paths in that binary. I expect the solutions generated by angr to reproduce exactly the execution paths that were explored. (I think reproducibility could be the most powerful weapon in symbolic execution, right?) However, that does not seem to hold here.
Here is the script (angr-test.py) I used:
import angr
import claripy
import sys

target_binary = sys.argv[1]

# two symbolic bytes, passed to the program as argv[1]
sym_input = claripy.BVS("sym_input", 8*2)

p = angr.Project(target_binary, load_options={'auto_load_libs': False})
state = p.factory.entry_state(
    args=[target_binary, sym_input],
    add_options={
        angr.options.ZERO_FILL_UNCONSTRAINED_MEMORY,
        angr.options.ZERO_FILL_UNCONSTRAINED_REGISTERS,
    },
)
sm = p.factory.simulation_manager(state)
sm.run()
print(sm)

# constrain each input byte to the range 'A'..'Z'
# (note: this is added to the initial state after sm.run(), so it does not
# affect the solutions of the states already in sm.deadended)
for byte in sym_input.chop(bits=8):
    state.add_constraints(byte >= 'A'.encode(), byte <= 'Z'.encode())

# for every finished path, print its stdout, its stdin, and a concrete argv[1]
for i in range(len(sm.deadended)):
    print("No.", i, " solution:")
    print("output: ", sm.deadended[i].posix.dumps(1).decode('utf-8'))
    print("input: ", sm.deadended[i].posix.dumps(sys.stdin.fileno()))
    b = sm.deadended[i].solver.eval(sym_input, cast_to=bytes).rstrip(b'\x00')
    print("org b: ", b)
    print("hex b: ", b.decode("utf-8", "ignore"))
Then I ran python3 angr-test.py b-gcc-O1 and got the following output:
$ python3 angr-test.py b-gcc-O1
<SimulationManager with 4 deadended>
No. 0 solution:
output: 0 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b
input: b''
org b: b'+3'
hex b: +3
No. 1 solution:
output: 0 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b
input: b''
org b: b'7g'
hex b: 7g
No. 2 solution:
output: 0 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b
input: b''
org b: b'c1'
hex b: c1
No. 3 solution:
output: 0 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b
input: b''
org b: b'\x0ff'
hex b: f
The numbers in the output are the ids of the blocks (c = continue; b = break) executed by each state (printed by calling printf("id"); in the source code of b-gcc-O1).
Then, when I try to reproduce the explored paths (sequences of numbers) using the solutions given by angr, I cannot. Here are the reproduction results:
$ ./b-gcc-O1 "+3"
0 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b
$ ./b-gcc-O1 "7g"
0 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b
$ ./b-gcc-O1 "c1"
0 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b
$ ./b-gcc-O1 "\x0ff"
0 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b
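When replaying a solution that contains non-printable bytes, shell quoting is worth ruling out: in bash, "\x0ff" inside double quotes is passed as the five literal characters \x0ff, not as the byte 0x0f followed by f. This does not explain the mismatches for printable inputs such as 7g and c1. A minimal sketch that hands the solver's bytes to the program as argv[1] without going through a shell is shown below; the helper name replay and its command-list parameter are illustrative assumptions (not part of angr), and the sketch assumes a POSIX system.

```python
import subprocess


def replay(command, solution):
    """Run `command` with `solution` (raw solver bytes) appended as one argument.

    `command` is a list such as ["./b-gcc-O1"]; the captured stdout is
    returned as text.
    """
    # C strings end at the first NUL byte, so the program can only ever see
    # the part of the solution that precedes it.
    arg = solution.split(b"\x00", 1)[0]
    # Passing a list (and no shell) delivers the bytes unchanged.
    result = subprocess.run(command + [arg], capture_output=True, check=False)
    return result.stdout.decode("utf-8", "replace")
```

For example, replay(["./b-gcc-O1"], b"\x0ff") would run the O1 binary with exactly those two bytes as its argument.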
It seems that only one of the solutions (No. 0, with input +3) reproduces the path reported by angr's execution (python3 angr-test.py b-gcc-O1). I guess angr should not behave like that, right? Did I misuse angr, or did I miss anything here?
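To pinpoint where a replayed run leaves the path angr reported, the two block-id sequences can be compared token by token. A minimal sketch follows; the function name first_divergence is an illustrative assumption.

```python
from itertools import zip_longest


def first_divergence(expected, actual):
    """Return (index, expected_token, actual_token) at the first mismatch.

    Both arguments are whitespace-separated block-id traces. Returns None
    when the traces are identical; a missing token is reported as None.
    """
    pairs = zip_longest(expected.split(), actual.split())
    for index, (exp_tok, act_tok) in enumerate(pairs):
        if exp_tok != act_tok:
            return index, exp_tok, act_tok
    return None
```

Applied to solution No. 1 (input 7g), the trace reported by angr and the trace of the concrete run agree on the first 19 tokens and then differ: angr reports c where the concrete run prints 12.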
A possible reason is that angr has trouble with binaries compiled at higher optimization levels. To check this, I recompiled the source code with gcc-11 -O0 to generate a new binary, b-gcc-O0, and ran angr again. Here are the new outputs:
$ python3 angr-test.py b-gcc-O0
<SimulationManager with 4 deadended>
No. 0 solution:
output: 0 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b
input: b''
org b: b'E\xc3'
hex b: E
No. 1 solution:
output: 0 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 84 145 b
input: b''
org b: b'Cf'
hex b: Cf
No. 2 solution:
output: 0 3 6 c 6 c 6 c 6 c 84 145 b 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b
input: b''
org b: b'1\xb5'
hex b: 1
No. 3 solution:
output: 0 3 6 c 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b 3 6 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 12 16 16 16 16 22 54 59 63 63 63 63 6 c 6 c 6 c 87 92 92 92 92 145 b
input: b''
org b: b'D0'
hex b: D0
This time I can reproduce the paths that angr reported.
So, could you please help check what the problem could be? Is it wrong usage on my part or a possible issue in angr? If it’s the latter, in what situations can solutions given by angr not be reproduced (apart from environment-modeling barriers in symbolic execution, e.g., use of syscalls)? The tested binary does not involve many environmental issues.
I tested the binaries on an Ubuntu 18.04 system using the angr-dev version. Here are the binaries used: b-gcc-O1 and b-gcc-O0.zip
Thank you so much for your help and look forward to your insightful reply!
Best regards, Haoxin
Top GitHub Comments
We figured it out! The culprit was a bad IR optimization introduced this past December. I’ve fixed it in https://github.com/angr/vex/commit/5e416aa5e8d10875d8a3bd2a9739d107d2ba0075.
Wow, thank you so much for the quick diagnosis of the root cause of the issue and insightful comments, @rhelmot!
I am also surprised that the culprit is the simplest printf (even without any arguments), which may indicate that environment modeling is still one of the major challenges in practical symbolic execution.
May I ask some further questions related to binary analysis? My initial idea is to find semantic divergence between two binaries (step 1: find the divergence; step 2: trigger the divergence), so I insert the execution block ids as a measurement that identifies a unique execution path. Based on the issues I encountered, I may need another approach; at the least, I need to avoid as many of the environment-modeling issues hidden in symbolic execution as possible, to reduce false alarms when I try to find compiler bugs.
So my question is: do you think such an idea can work in practice? I am not an expert on binary analysis. In theory I think it could work, but in practice it may be obstructed by other challenges. For now, the most confusing part is that when I feed angr two binaries compiled from the same source code (with the same input but possibly different optimization levels), it reports different numbers of states in the deadended stash for the two binaries. Theoretically, this could point to a compiler bug, since some paths exist in one binary but cannot be found in the other. However, the divergences do not appear when I perform step 2: the solutions unique to one binary (those for the divergent paths) still reproduce the paths that angr’s symbolic execution missed. I guess this may not be an issue in angr, and maybe we can only say there are some limitations in angr’s symbolic execution? What do you think? Is it possible to find and trigger a divergent path between two binaries in practice, specifically by using angr? What are the possible barriers in angr if it is not applicable in practice? Do you know of any solutions that use angr to detect semantic differences in binaries? (I tried to find one, but failed.)
I am sorry to take up so much of your time, and I hope you don’t mind!
Thanks, Haoxin
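Step 2 described above (triggering a divergence) can be approximated without symbolic execution by replaying the same inputs on both binaries and comparing what they print. A minimal sketch, assuming a POSIX system; the helper names run_with_arg and divergent_inputs and their command-list parameters are illustrative assumptions, not part of angr.

```python
import subprocess


def run_with_arg(command, arg):
    """Run `command` (a list) with `arg` (bytes) appended; return raw stdout."""
    result = subprocess.run(command + [arg], capture_output=True, check=False)
    return result.stdout


def divergent_inputs(command_a, command_b, inputs):
    """Return (input, stdout_a, stdout_b) for inputs whose outputs differ."""
    diverging = []
    for arg in inputs:
        out_a = run_with_arg(command_a, arg)
        out_b = run_with_arg(command_b, arg)
        if out_a != out_b:
            diverging.append((arg, out_a, out_b))
    return diverging
```

For example, divergent_inputs(["./b-gcc-O0"], ["./b-gcc-O1"], [b"+3", b"7g", b"c1"]) would list the inputs for which the two binaries print different block-id sequences. An empty result means only that these particular inputs do not trigger a divergence.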