Add parsing of OVERLAY section for PE files

I’m currently playing with the IOLI crackme files from https://github.com/Maijin/Workshop2015/tree/master/IOLI-crackme, which are available for both Linux and Windows. I’ve been comparing angr’s behavior on matching binaries in the two file formats (ELF and PE).

I’d like to record a few of the differences here to (a) see if I’m crazy, (b) see if more specific tickets should be created, and (c) get some pointers / TODOs on what specifically needs to be implemented for better PE/EXE support.

The main function for the crackme0x00 file is (from Radare2):

[0x08048414]> pdf
/ (fcn) sym.main 127
|           ; var int local_6      @ ebp-0x18
|           ; DATA XREF from 0x08048377 (sym.main)
|           ;-- main:
|           0x08048414    55             push ebp
|           0x08048415    89e5           mov ebp, esp
|           0x08048417    83ec28         sub esp, 0x28
|           0x0804841a    83e4f0         and esp, 0xfffffff0
|           0x0804841d    b800000000     mov eax, 0
|           0x08048422    83c00f         add eax, 0xf
|           0x08048425    83c00f         add eax, 0xf
|           0x08048428    c1e804         shr eax, 4
|           0x0804842b    c1e004         shl eax, 4
|           0x0804842e    29c4           sub esp, eax
|           0x08048430    c70424688504.  mov dword [esp], str.IOLI_Crackme_Level_0x00_n ; [0x8048568:4]=0x494c4f49  LEA str.IOLI_Crackme_Level_0x00_n ; "IOLI Crackme Level 0x00." @ 0x8048568
|           0x08048437    e804ffffff     call sym.imp.printf
|           0x0804843c    c70424818504.  mov dword [esp], str.Password: ; [0x8048581:4]=0x73736150  LEA str.Password: ; "Password: " @ 0x8048581
|           0x08048443    e8f8feffff     call sym.imp.printf
|           0x08048448    8d45e8         lea eax, [ebp-local_6]
|           0x0804844b    89442404       mov dword [esp + 4], eax
|           0x0804844f    c704248c8504.  mov dword [esp], 0x804858c    ; [0x804858c:4]=0x32007325  ; "%s" @ 0x804858c
|           0x08048456    e8d5feffff     call sym.imp.scanf
|           0x0804845b    8d45e8         lea eax, [ebp-local_6]
|           0x0804845e    c74424048f85.  mov dword [esp + 4], str.250382 ; [0x804858f:4]=0x33303532  LEA str.250382 ; "250382" @ 0x804858f
|           0x08048466    890424         mov dword [esp], eax
|           0x08048469    e8e2feffff     call sym.imp.strcmp
|           0x0804846e    85c0           test eax, eax
|       ,=< 0x08048470    740e           je 0x8048480
|       |   0x08048472    c70424968504.  mov dword [esp], str.Invalid_Password__n ; [0x8048596:4]=0x61766e49  LEA str.Invalid_Password__n ; "Invalid Password!." @ 0x8048596
|       |   0x08048479    e8c2feffff     call sym.imp.printf
|      ,==< 0x0804847e    eb0c           jmp 0x804848c
|      |`-> 0x08048480    c70424a98504.  mov dword [esp], str.Password_OK_:__n ; [0x80485a9:4]=0x73736150  LEA str.Password_OK_:__n ; "Password OK :)." @ 0x80485a9
|      |    0x08048487    e8b4feffff     call sym.imp.printf
|      |    ; JMP XREF from 0x0804847e (sym.main)
|      `--> 0x0804848c    b800000000     mov eax, 0
|           0x08048491    c9             leave
\           0x08048492    c3             ret
[0x08048414]>

Similarly for the EXE:

/ (fcn) sym._main 141
|           ; var int local_0_1    @ ebp-0x1
|           ; var int local_6      @ ebp-0x18
|           ; var int local_7      @ ebp-0x1c
|           ; CALL XREF from 0x00401222 (sym._main)
|           0x00401310    55             push ebp
|           0x00401311    89e5           mov ebp, esp
|           0x00401313    83ec38         sub esp, 0x38
|           0x00401316    83e4f0         and esp, 0xfffffff0
|           0x00401319    b800000000     mov eax, 0
|           0x0040131e    83c00f         add eax, 0xf
|           0x00401321    83c00f         add eax, 0xf
|           0x00401324    c1e804         shr eax, 4
|           0x00401327    c1e004         shl eax, 4
|           0x0040132a    8945e4         mov dword [ebp-local_7], eax
|           0x0040132d    8b45e4         mov eax, dword [ebp-local_7]
|           0x00401330    e83b190000     call 0x402c70                  ; sym.___w32_sharedptr_initialize+0x220
|           0x00401335    e836010000     call sym.___main
|           0x0040133a    c70424004040.  mov dword [esp], str.IOLI_Crackme_Level_0x00_n ; [0x404000:4]=0x494c4f49  LEA section..rdata ; "IOLI Crackme Level 0x00." @ 0x404000
|           0x00401341    e8ea190000     call sym._printf
|           0x00401346    c70424194040.  mov dword [esp], str.Password: ; [0x404019:4]=0x73736150  LEA str.Password: ; "Password: " @ 0x404019
|           0x0040134d    e8de190000     call sym._printf
|           0x00401352    8d45e8         lea eax, [ebp-local_6]
|           0x00401355    89442404       mov dword [esp + 4], eax
|           0x00401359    c70424244040.  mov dword [esp], 0x404024     ; [0x404024:4]=0x32007325  ; "%s" 0x00404024  ; "%s" @ 0x404024
|           0x00401360    e8bb190000     call sym._scanf
|           0x00401365    8d45e8         lea eax, [ebp-local_6]
|           0x00401368    c74424042740.  mov dword [esp + 4], str.250382 ; [0x404027:4]=0x33303532  LEA str.250382 ; "250382" @ 0x404027
|           0x00401370    890424         mov dword [esp], eax
|           0x00401373    e898190000     call sym._strcmp
|           0x00401378    85c0           test eax, eax
|       ,=< 0x0040137a    740e           je 0x40138a
|       |   0x0040137c    c704242e4040.  mov dword [esp], str.Invalid_Password__n ; [0x40402e:4]=0x61766e49  LEA str.Invalid_Password__n ; "Invalid Password!." @ 0x40402e
|       |   0x00401383    e8a8190000     call sym._printf
|      ,==< 0x00401388    eb0c           jmp 0x401396
|      |`-> 0x0040138a    c70424414040.  mov dword [esp], str.Password_OK_:__n ; [0x404041:4]=0x73736150  LEA str.Password_OK_:__n ; "Password OK :)." @ 0x404041
|      |    0x00401391    e89a190000     call sym._printf
|      |    ; JMP XREF from 0x00401388 (sym._main)
|      `--> 0x00401396    b800000000     mov eax, 0
|           0x0040139b    c9             leave
\           0x0040139c    c3             ret

First of all, angr doesn’t seem to recognize all the symbols in the EXE. For example, the following call works on the Linux version but fails on the EXE: main = proj.loader.main_bin.get_symbol('main')

I’m not sure whether it’s related, but building a CFG from main (using the address of main as the start, because the symbol is not found, as shown above) produces two very different CFGs, even though the disassembly above suggests they should be the same. On the ELF side, the CFG is as expected, with 9 basic blocks. [Figure: 0x00_cfg] (Pictures made using https://github.com/axt/angr-utils)
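The 9-block figure can be cross-checked against the ELF listing by hand. The snippet below is only a sketch: it assumes a basic block ends at every call, jump, and return (which is consistent with the count reported above), and the addresses and instruction lengths are copied from the Radare2 output. The names `TERMINATORS` and `block_leaders` are made up for this illustration and are not part of angr.

```python
# Control-flow instructions of the ELF main, copied from the listing above:
# (address, length in bytes, mnemonic, branch target or None)
FUNC_START = 0x08048414
FUNC_END = FUNC_START + 127  # radare2 reports a function size of 127 bytes

TERMINATORS = [
    (0x08048437, 5, "call", None),
    (0x08048443, 5, "call", None),
    (0x08048456, 5, "call", None),
    (0x08048469, 5, "call", None),
    (0x08048470, 2, "je", 0x08048480),
    (0x08048479, 5, "call", None),
    (0x0804847E, 2, "jmp", 0x0804848C),
    (0x08048487, 5, "call", None),
    (0x08048492, 1, "ret", None),
]


def block_leaders(start, end, terminators):
    """Return the sorted start addresses of the basic blocks in [start, end)."""
    leaders = {start}
    for addr, length, _mnemonic, target in terminators:
        fallthrough = addr + length
        # The instruction after a block terminator starts a new block.
        if fallthrough < end:
            leaders.add(fallthrough)
        # A branch target inside the function also starts a new block.
        if target is not None and start <= target < end:
            leaders.add(target)
    return sorted(leaders)


leaders = block_leaders(FUNC_START, FUNC_END, TERMINATORS)
print(len(leaders), [hex(a) for a in leaders])
```

Under the same assumption, the PE version of main should come out only slightly larger, since its listing contains two extra calls before the first printf.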

On the PE side, the CFG has several hundred blocks and seems to span into other functions. [Figure: 0x00_exe_cfg] Note: main is actually way down in the bottom-right corner of this huge graph. Also note that the function name for each node in this graph is “None” instead of the actual name shown in the Linux graph (probably the same problem as above).

Finally, on the ELF side, angr automatically hooks functions like scanf and printf, but on the PE side that does not appear to work. I’m able to symbolically solve both the PE and ELF versions, but for the PE file I have to set up the hooks manually first.

Maybe all these problems come down to the function symbols not being found properly; I’m not sure.
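For reference, here is a sketch of what manual hooking on the PE side might look like. The first part is plain arithmetic: it decodes the `call rel32` bytes from the EXE listing above to find the addresses that `sym._printf`, `sym._scanf`, and `sym._strcmp` resolve to. The second part, `hook_libc`, is a hypothetical helper that assumes a recent angr release in which `proj.hook(addr, simprocedure_instance)` and `angr.SIM_PROCEDURES["libc"]` are available; the exact hooking API in use when this issue was filed may have differed.

```python
import struct


def call_target(addr: int, encoding: bytes) -> int:
    """Return the destination of a 5-byte x86 `call rel32` located at addr."""
    if len(encoding) != 5 or encoding[0] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    (rel,) = struct.unpack("<i", encoding[1:])  # signed 32-bit displacement
    # The displacement is relative to the address of the next instruction.
    return (addr + 5 + rel) & 0xFFFFFFFF


# (address, instruction bytes) copied from the EXE disassembly above.
PE_CALLS = {
    "printf": (0x00401341, "e8ea190000"),
    "scanf": (0x00401360, "e8bb190000"),
    "strcmp": (0x00401373, "e898190000"),
}

PE_TARGETS = {
    name: call_target(addr, bytes.fromhex(raw))
    for name, (addr, raw) in PE_CALLS.items()
}


def hook_libc(proj, targets):
    """Hypothetical helper: hook each address with angr's libc SimProcedure."""
    import angr  # imported lazily so the arithmetic above runs without angr

    for name, addr in targets.items():
        proj.hook(addr, angr.SIM_PROCEDURES["libc"][name]())


print({name: hex(addr) for name, addr in PE_TARGETS.items()})
```

With a project loaded from the EXE, `hook_libc(proj, PE_TARGETS)` would then stand in for the automatic hooking that happens on the ELF side.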

So… Brain dump I know, but please feel free to split these into as many tickets as you feel is appropriate. I would love to see some pointers to jumping-off points where someone new to the project could start with helping resolve some of these discrepancies. Thanks!

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 1
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

rhelmot commented, Jun 16, 2016 (1 reaction)

So most of this is known - a vast amount of effort has gone into ELF and Linux support in angr, and almost none into Windows.

The majority of these problems are unrelated to CLE; the only CLE-related one is the get_symbol call failing. To the best of my knowledge, the rest boil down to a lack of support for callee-cleanup calling conventions everywhere in angr and a lack of SimProcedures for Windows libraries. There also might be some Linux-specific logic in the CFG algorithm.

The only person on the angr team who actually knows how Windows binaries work is @ltfish, but more to the point this is not something we can give a lot of attention to before the CGC in August.

As for an initial jumping-off point, CLE’s support for PE files is based on the pefile module. If I recall correctly, we make no attempt whatsoever to provide symbols for PE files, so if you want to look into how pefile exposes symbols and perform the translation necessary to implement get_symbol for the PE backend, that’d be pretty cool.
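To make the symbol translation a bit more concrete, here is a rough, stdlib-only sketch of where function names such as `_main` can live in a PE file that still carries a COFF symbol table. It does not use pefile or CLE, and `read_coff_symbols` is a name invented for this illustration; it only follows the documented layout of the COFF file header (PointerToSymbolTable, NumberOfSymbols), the 18-byte symbol records, and the string table that follows them. Whether a given EXE has such a table depends on how it was built and stripped.

```python
import struct

SYMBOL_SIZE = 18  # size in bytes of one COFF symbol record


def read_coff_symbols(data: bytes) -> dict:
    """Map symbol name -> (section_number, value) from a PE's COFF symbol table."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew in the DOS header
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF header follows the signature; skip Machine (2 bytes),
    # NumberOfSections (2) and TimeDateStamp (4) to reach the symbol fields.
    sym_ptr, sym_count = struct.unpack_from("<II", data, pe_offset + 4 + 8)
    if sym_ptr == 0:
        return {}  # no COFF symbol table present (e.g. stripped binary)
    strtab = sym_ptr + sym_count * SYMBOL_SIZE  # string table follows the symbols
    symbols = {}
    index = 0
    while index < sym_count:
        raw_name, value, section, _type, _storage, n_aux = struct.unpack_from(
            "<8sIhHBB", data, sym_ptr + index * SYMBOL_SIZE
        )
        if raw_name[:4] == b"\0\0\0\0":
            # Long name: the last 4 bytes are an offset into the string table.
            (offset,) = struct.unpack_from("<I", raw_name, 4)
            end = data.index(b"\0", strtab + offset)
            name = data[strtab + offset:end]
        else:
            # Short name: stored inline, NUL-padded to 8 bytes.
            name = raw_name.rstrip(b"\0")
        symbols[name.decode("ascii", "replace")] = (section, value)
        index += 1 + n_aux  # auxiliary records carry no name of their own
    return symbols
```

For a symbol defined in a section, the value is an offset within that section, so a get_symbol implementation would still need to add the section’s virtual address and the image base to obtain an address like the 0x00401310 shown for sym._main above.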

github-actions[bot] commented, Aug 7, 2022 (0 reactions)

This issue has been closed due to inactivity.
