Merge simuvex and angr
Current Status
SimuVEX is no more. It’s been absorbed into angr, and lives on as a compatibility stub. Path is gone, and SimulationManager is the new PathGroup. The future is here.
The Plan
The current plan is to take this in 5 phases:
- Phase 1: merge simuvex into angr, redistribute the files under the angr module. No functionality changes.
- angr passes
- angr-bf passes
- angr-doc passes
- tracer passes
- rex passes
- colorguard passes
- identifier passes
- fidget passes
- patcherex passes
- angrop passes
- Phase 2: remove Path, make PathHistory a SimStatePlugin called StateHistory, rename PathHierarchy to StateHierarchy, rename PathGroup to SimContext.
- angr passes
- angr-bf passes
- angr-doc passes
- tracer passes
- rex passes
- colorguard passes
- identifier passes
- fidget passes
- patcherex passes
- angrop passes
- Phase 3: full backwards compatibility – stub modules for simuvex and angr.Path, and an alias for angr.PathGroup
- simuvex stub
- pass-throughs for state.log and state.scratch
- alias for PathGroup
- alias for state.state
- alias for state.length
- old angr-doc examples pass
- old angr testcases pass
- old angr-bf testcases pass
- old tracer passes
- old rex passes
- old colorguard passes
- old identifier passes
- old fidget passes
- old patcherex passes
- old angrop passes
- Phase 4: clean up remaining issues in the SimuVEX repo.
- Phase 5: squash angr commits to make Fish happy
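As a rough illustration of the Phase 3 "alias for PathGroup" item, the old name could be kept as a thin subclass that warns when used. This is only a sketch: the class bodies, the `active` accessor, and the warning text are assumptions made for illustration, and the real angr classes are not reproduced here.

```python
import warnings


class SimulationManager:
    """Hypothetical stand-in for the renamed class; holds stashes of states."""

    def __init__(self, active=None):
        self.stashes = {"active": list(active or [])}

    @property
    def active(self):
        return self.stashes["active"]


class PathGroup(SimulationManager):
    """Hypothetical compatibility alias: same behavior, plus a deprecation warning."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "PathGroup is deprecated; use SimulationManager instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

Old code that constructs a PathGroup keeps working under this scheme, while the warning tells users which name to migrate to.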
Original Musings
@rhelmot and I had a discussion that started out with my frustration that adding a syscall to angr requires two PRs (one to SimuVEX for the SimProcedure and one for angr to change the SimOS syscall table). @rhelmot made an unorthodox suggestion that I’ll reiterate toward the end of this issue. First, some observations:
- Some of the modules we’ve written for angr are being used in other projects. This includes CLE, PyVEX, claripy, and archinfo. One module conspicuously missing from this list (as far as I know) is SimuVEX.
- This is probably because the other modules have very well-defined, self-contained purposes, separate from angr. SimuVEX, on the other hand, lives as a tightly-coupled state mutation engine for angr (you can probably begin to see where this is going now).
- On the angr side, over the last year, Path has become more and more of a thin wrapper around SimState, PathHistory, and CallStack (the last of which really belongs in PathHistory, anyways), plus convenience accessors for those (i.e., path.events, path.actions, path.jumpkind, and the like are all pass-throughs to data stored in PathHistory).
- State options, while technically a SimuVEX construct and technically tied to the state, are used to control behavior all over the place in angr (e.g., in the CFG, in PathHistory, in PathHierarchy, etc).
- SimInspect breakpoints, also technically a SimuVEX construct, have all sorts of angr-triggered events.
- We’re constantly running into problems where we don’t have access to the path or project from SimuVEX (specifically, from SimInspect hooks).
- Because we run into the same problem tracking information over successive paths as we do over successive states, we’ve invented path.info, which is basically a stripped-down SimStatePlugin.
. - We’ve had debates on, for example, pulling SimProcedures out of SimuVEX, pulling SimOS out of angr and into some weird joint package with the SimProcedures, etc.
- SimEngines are currently split across two different packages: SimEngineVEX, SimEngineProcedure, and SimEngineUnicorn are in SimuVEX; SimEngineFailure and SimEngineSyscall are in angr.
- SimuVEX is a weird name for a state mutation engine that supports multiple execution/translation engines.
Basically, the decoupling between angr and simuvex is not really there, unlike with CLE, PyVEX, claripy, archinfo, etc. A proposed solution is to eliminate simuvex as a separate package and merge it into angr. The changes would be:
- SimuVEX would be merged into the angr repo.
- Path would be eliminated. Each SimState would have a state.history attribute with the PathHistory (now probably called StateHistory). This would eliminate the weird thing where we copy everything from the state.log plugin into path.history at every state – they could be the same entity.
- PathGroup would now wrangle stashes of states instead of stashes of paths.
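To make the proposed shape concrete, here is a minimal sketch of a state that owns its own history, with stashes holding states directly. Everything in it is an assumption for illustration: the `lineage` and `step` helpers, the constructor arguments, and the stash names are hypothetical and are not angr's actual API.

```python
class StateHistory:
    """Hypothetical per-state history node that links back to its parent."""

    def __init__(self, parent=None, addr=None, jumpkind=None):
        self.parent = parent
        self.addr = addr
        self.jumpkind = jumpkind

    def lineage(self):
        """Return the visited addresses, oldest first."""
        addrs = []
        node = self
        while node is not None:
            if node.addr is not None:
                addrs.append(node.addr)
            node = node.parent
        return list(reversed(addrs))


class SimState:
    """Hypothetical state that carries its history instead of a Path wrapper."""

    def __init__(self, history=None):
        self.history = history if history is not None else StateHistory()

    def step(self, addr, jumpkind="Ijk_Boring"):
        """Produce a successor whose history extends this state's history."""
        return SimState(StateHistory(parent=self.history, addr=addr, jumpkind=jumpkind))


# Stashes hold states directly; no separate Path object is needed.
entry = SimState()
s1 = entry.step(0x400000)
s2 = s1.step(0x400010)
stashes = {"active": [s2], "deadended": []}
```

Because each history node only points at its parent, successors share their common prefix rather than copying it, which is the kind of duplication the proposal wants to avoid.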
To make the transition smoother, we’d have the following deprecation support:
- If we create a state.state property, people can still do stuff like path_groups.found[0].state as before. Likewise for properties for state.history (which would be the replacement of state.log, anyways), state.callstack, and so on.
- We’d probably have to leave a dummy simuvex package for a while, with deprecated passthroughs to the old concepts.
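The deprecation properties described above might look something like the following sketch. The class is a hypothetical stand-in, the list used for `history` is a placeholder for the real plugin, and the warning messages are assumptions.

```python
import warnings


class SimState:
    """Hypothetical post-merge state with deprecated pass-through properties."""

    def __init__(self):
        self.history = []  # placeholder for the StateHistory plugin

    @property
    def state(self):
        """Deprecated: lets code written against Path keep using `.state`."""
        warnings.warn(
            "state.state is deprecated; use the state itself",
            DeprecationWarning,
            stacklevel=2,
        )
        return self

    @property
    def log(self):
        """Deprecated: the old state.log is now the same object as state.history."""
        warnings.warn(
            "state.log is deprecated; use state.history",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.history
```

With this in place, an expression written for the old API that ends in `.state` still resolves to the same state object, at the cost of one warning per access.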
This is a big conceptual change – it’s angr 7.7.4.1 sort of stuff for sure, but it seems to me that it would make sense and would simplify using and extending angr.
Top GitHub Comments
As a side note: I’m against blowing away Path. Path is still useful as a descriptor of what addresses were encountered, what events have happened, and what states there are during an execution. It seems to me that using state.history for that purpose is an abuse.
I banish this issue to the depths of hell.