[RFC] Symbion state of the art and future ideas
I’m opening this issue to report on the current state of Symbion.
I’ve recently been contacted regarding bugs and ideas, so I thought it was probably time to write a list of things that need to be done to improve the system and support new features. I’m writing this in the hope of encouraging contributions from the community, as I have very little time to keep maintaining the project.
ConcreteTargets Support
- Currently, the only maintained ConcreteTarget is the AvatarGDBConcreteTarget. I can confirm the system works for the analysis of ELF/ELF64 binaries on an Ubuntu 18.04 system with gdb and gdbserver at version 8.1.1 (it should also work up to 9.2).
- I’ve recently discovered that the support for the analysis of Windows binaries is no longer compatible with the AvatarGDBConcreteTarget. To solve this issue, we can try to fix Avatar2 or, better, either drop that dependency and write our own GDBConcreteTarget based purely on pygdbmi, or write a WinDBGTarget. Personally, I would really like to move to our own target that doesn’t depend on Avatar2.
- I’m aware someone wrote a PandaConcreteTarget and a ZelosConcreteTarget; it would be awesome if these projects could be pushed to our angr-targets repo! 😃
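To make the target discussion more concrete, here is a minimal sketch of the kind of interface a concrete target has to expose so that registers and memory can be imported from it. The class and method names are illustrative assumptions, not the actual angr-targets API, and the in-memory backend exists only so the sketch runs without a debugger attached.

```python
from abc import ABC, abstractmethod


class ConcreteTargetSketch(ABC):
    """Hypothetical minimal interface; names are assumptions, not the angr-targets API."""

    @abstractmethod
    def read_register(self, name: str) -> int: ...

    @abstractmethod
    def write_register(self, name: str, value: int) -> None: ...

    @abstractmethod
    def read_memory(self, address: int, size: int) -> bytes: ...

    @abstractmethod
    def write_memory(self, address: int, data: bytes) -> None: ...

    @abstractmethod
    def set_breakpoint(self, address: int) -> None: ...

    @abstractmethod
    def run(self) -> int:
        """Resume execution and return the address where the target stopped."""


class InMemoryTarget(ConcreteTargetSketch):
    """Toy backend that fakes a process with a register dict and a bytearray."""

    def __init__(self, base: int, memory: bytes, registers: dict):
        self.base = base
        self.memory = bytearray(memory)
        self.registers = dict(registers)
        self.breakpoints = set()

    def _offset(self, address: int, size: int) -> int:
        # Translate an absolute address into an index, rejecting unmapped ranges.
        offset = address - self.base
        if offset < 0 or offset + size > len(self.memory):
            raise ValueError("address range is not mapped")
        return offset

    def read_register(self, name):
        return self.registers[name]

    def write_register(self, name, value):
        self.registers[name] = value

    def read_memory(self, address, size):
        offset = self._offset(address, size)
        return bytes(self.memory[offset:offset + size])

    def write_memory(self, address, data):
        offset = self._offset(address, len(data))
        self.memory[offset:offset + len(data)] = data

    def set_breakpoint(self, address):
        self.breakpoints.add(address)

    def run(self):
        # Fake execution: advance pc one byte at a time until a breakpoint
        # is hit or the end of the mapped memory is reached.
        pc = self.registers["pc"]
        end = self.base + len(self.memory)
        while pc < end:
            pc += 1
            if pc in self.breakpoints:
                break
        self.registers["pc"] = pc
        return pc
```

A GDB-, WinDbg-, PANDA- or Zelos-backed target would implement the same handful of operations against a real process, which is why keeping this surface small makes new targets cheaper to write.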
Architectures Support
- Currently, Symbion can import the concrete state of programs compiled for x86/AMD64 and ARM32 architectures. As different architectures have different peculiarities, importing the state of a program written for an unsupported architecture requires extending the concrete state plugin.
Program Behaviors
- Symbion conceptually synchronizes only the state of the registers and memory. Any other data outside of that (e.g., sockets, open files) is not taken into account.
- I’ve never experimented with programs using threads, even though that would be extremely interesting to try out. I’m aware someone did.
- Controlling how SimProcedures are re-hooked after a synchronization from concrete to symbolic is done through the option use_sim_proc (passed to the Project) and the SimStateOption SYMBION_KEEP_STUBS_ON_SYNC. More details on this can be found here.
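The options above can be sketched as follows, based on the usage shown in the angr documentation for Symbion. This assumes an angr release that still ships Symbion, the angr-targets and avatar2 packages, and a gdbserver already listening for the binary. The function name and its parameters are placeholders. Note that the angr documentation spells the Project keyword use_sim_procedures, and that SYMBION_SYNC_CLE is an additional option taken from that documentation rather than from this issue.

```python
def explore_until(binary_path, gdbserver_host, gdbserver_port, stop_address):
    """Run the binary concretely up to stop_address and return the synced state.

    Placeholder helper: requires angr (with Symbion), angr-targets, avatar2,
    and a gdbserver listening on gdbserver_host:gdbserver_port.
    """
    # Imported lazily so this module still loads where the packages are missing.
    import angr
    import avatar2
    from angr_targets import AvatarGDBConcreteTarget

    target = AvatarGDBConcreteTarget(
        avatar2.archs.x86.X86_64, gdbserver_host, gdbserver_port
    )

    # use_sim_procedures decides whether library calls are hooked at all.
    project = angr.Project(
        binary_path,
        concrete_target=target,
        use_sim_procedures=True,
    )

    state = project.factory.entry_state()
    # Keep the loader's view of the address space in sync with the process.
    state.options.add(angr.options.SYMBION_SYNC_CLE)
    # Keep the SimProcedure stubs in place after a concrete-to-symbolic sync.
    state.options.add(angr.options.SYMBION_KEEP_STUBS_ON_SYNC)

    simgr = project.factory.simgr(state)
    # Let the concrete process run until it reaches stop_address.
    simgr.use_technique(angr.exploration_techniques.Symbion(find=[stop_address]))
    simgr.run()

    # Only registers and memory are imported; sockets and files are not.
    return simgr.stashes["found"][0]
```

The returned state can then be explored symbolically as usual, with the caveat from the list above that only registers and memory reflect the concrete process.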
Implementation Contributions
- Fuse Symbion into archr.
- Write a GDBConcreteTarget based on pygdbmi.
- Write a WinDBGTarget to support analysis on Windows environments without relying on CYGWIN.
- Revamp the IDAConcreteTarget.
- Improve support for the analysis of PIC binaries.
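As a starting point for the pygdbmi item, the sketch below shows how memory reads could go through GDB/MI directly. The class name and method set are hypothetical and not part of angr-targets. The GDB/MI commands (-target-select, -data-read-memory-bytes, -break-insert, -exec-continue) are standard, and the reply layout follows what pygdbmi's GdbController.write() returns. The parsing helper is kept separate so it can be exercised without gdb.

```python
def memory_from_mi(responses):
    """Extract raw bytes from the parsed reply to -data-read-memory-bytes.

    `responses` is the list of dicts returned by GdbController.write().
    Assumes the requested range is fully readable, so the returned
    blocks are contiguous.
    """
    for response in responses:
        if response.get("type") != "result":
            continue  # skip console output and async notifications
        if response.get("message") != "done":
            raise RuntimeError(f"gdb reported: {response.get('payload')}")
        blocks = response["payload"]["memory"]
        return b"".join(bytes.fromhex(block["contents"]) for block in blocks)
    raise RuntimeError("no result record in gdb reply")


class PyGdbMiTargetSketch:
    """Hypothetical GDBConcreteTarget-style wrapper around pygdbmi."""

    def __init__(self, host, port):
        # Lazy import: needs the pygdbmi package and a gdb binary on PATH.
        from pygdbmi.gdbcontroller import GdbController

        self.gdb = GdbController()
        self.gdb.write(f"-target-select remote {host}:{port}")

    def read_memory(self, address, size):
        reply = self.gdb.write(f"-data-read-memory-bytes {address:#x} {size}")
        return memory_from_mi(reply)

    def set_breakpoint(self, address):
        self.gdb.write(f"-break-insert *{address:#x}")

    def run(self):
        # -exec-continue returns as soon as the target resumes; the stop
        # event arrives later as an async record that a real target
        # would have to wait for.
        self.gdb.write("-exec-continue")
```

Register access, watchpoints and memory mappings would follow the same pattern, each as one MI command plus a small parser.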
Research Ideas
- Use Symbion to automatically skip unsupported instructions, grab the results from the concrete execution, and resume the symbolic execution.
- Context-aware ROP-chain generation. All current ROP-chain generation tools work without the context of the program in the “exploited state” (i.e., values in memory and in registers when starting to ROP). Can we improve the quality of the generated ROP chains by adding this information? I can see this being integrated in our angrop.
- How much does Symbion benefit the analysis of VERY complex binaries? Can we show that we can apply symbolic execution to, let’s say, Google Chrome, and symbolically explore code that couldn’t really be analyzed in a scalable way by the current SoA tools?
- ?
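The first idea in the list above (skipping unsupported instructions) can be illustrated with a toy model. Everything here is hypothetical: a three-instruction ISA, symbolic values represented as strings, and a concrete process that runs in lockstep with the symbolic engine, which is a simplification of handing control to a real concrete target. It only shows the control flow: fail symbolically, take the concrete result, resume.

```python
MASK = 0xFFFFFFFF


class Unsupported(Exception):
    """Raised by the toy symbolic engine for instructions it cannot model."""


def concrete_exec(regs, insn):
    # The concrete side can execute every instruction of the toy ISA.
    op, dst, imm = insn
    if op == "mov":
        regs[dst] = imm
    elif op == "add":
        regs[dst] = (regs[dst] + imm) & MASK
    elif op == "popcnt":
        regs[dst] = bin(regs[dst]).count("1")
    else:
        raise ValueError(f"unknown instruction {op}")


def symbolic_exec(regs, insn):
    # The symbolic side only models mov and add; values are ints or
    # expression strings such as "(x + 1)".
    op, dst, imm = insn
    if op == "mov":
        regs[dst] = imm
    elif op == "add":
        current = regs[dst]
        if isinstance(current, int):
            regs[dst] = (current + imm) & MASK
        else:
            regs[dst] = f"({current} + {imm})"
    else:
        raise Unsupported(op)


def hybrid_run(program, symbolic_regs, concrete_regs):
    """Execute program, importing concrete results for unsupported instructions."""
    skipped = []
    for index, insn in enumerate(program):
        concrete_exec(concrete_regs, insn)
        try:
            symbolic_exec(symbolic_regs, insn)
        except Unsupported:
            # Import the concrete result. This concretizes the register,
            # so any symbolic expression it held is lost.
            dst = insn[1]
            symbolic_regs[dst] = concrete_regs[dst]
            skipped.append(index)
    return skipped


program = [("add", "eax", 1), ("popcnt", "eax", 0), ("add", "eax", 2)]
symbolic_regs = {"eax": "x"}
concrete_regs = {"eax": 7}
skipped = hybrid_run(program, symbolic_regs, concrete_regs)
```

The loss of symbolic information at the import step is the interesting research question: the concrete value is only valid for the input the concrete process was run with.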
Extra Resources
More information regarding our motivation and Symbion’s internal details can be found in the paper, the blogpost, and the slides I’ve put together for the presentation at CNS2020.
Moreover, you can find more examples in this repo.
I’ve also recently opened a dedicated channel (#symbion) on our angr Slack. Feel free to PM me or yell in the channel for any other ideas or to discuss the ones presented here!
Top GitHub Comments
Disclaimer: Avatar2-dev here.
It’s great to see additional work on Symbion! Avatar2 continues to be under active development, and we are happy to review, support, and merge requests extending the functionality of our framework (e.g., Windows support).
That being said, instead of dropping avatar2 support, I would rather suggest that we (together) spend cycles on streamlining APIs. This way, avatar2 targets could be seamlessly integrated into Symbion (we have a whole bunch, including a pygdbmi-based GDBTarget and a PandaTarget, here): https://github.com/avatartwo/avatar2/tree/main/avatar2/targets
Similarly, this would allow easy integration of existing Symbion targets into avatar2 (e.g., the radare2 target). I think instead of constantly re-inventing the wheel and duplicating implementation efforts, we should look for opportunities to benefit both projects. While we have by far a smaller development and user community than angr, we are happy to contribute to other projects. For instance, we recently upstreamed our configurable machine into PANDA. Personally, I found it a shame that targets implemented for Symbion never made it over to avatar2, and vice versa.
On the avatar2 side, this would require modifications to import target implementations standalone (rather than driven by an avatar2 project). If you would be up for going in this direction, we can start looking into how to make this happen.
Cheers, Marius
Edit: If we streamlined our targets, this could also involve a dedicated CI setup, which would make sure changes on one end do not break compatibility on the other side (as happened for you).
Edit2: Regarding the research ideas, in our original avatar2 release (bar18), where we symbolically analyzed Firefox, we also included an instruction-forwarder plugin for the concrete target: https://github.com/avatartwo/avatar2/blob/bar18_avatar2/avatar2/plugins/instruction_forwarder.py Hence, I’d argue there’s even more potential for benefiting from each other, beyond the targets.
What about a QemuGdb target https://qemu-project.gitlab.io/qemu/system/gdb.html ? This will have many benefits:
The main challenge I see is it would require supporting ring0 infrastructure, such as page tables.