Decompiler misinterpreting x86 'switch' boundaries
Describe the bug
In the following assembly (from address 1008:151f onwards), there is a guard around the effective switch statement that ensures values outside the range 2 <= AX <= 9 use the same branch as caseD_4:
1008:1510 81 3e a2 CMP word ptr [g_nUnreadSamplesInBuffer_24a2],498
24 f2 01
1008:1516 7d 07 JGE LAB_1008_151f
1008:1518 83 3e 72 CMP word ptr [DAT_1030_0072],0x0
00 00
1008:151d 74 31 JZ LAB_1008_1550
LAB_1008_151f XREF[1]: 1008:1516(j)
1008:151f c4 5e fa LES aBuffer,[BP + pwf]
1008:1522 26 8a 07 MOV AL,byte ptr ES:[aBuffer]
1008:1525 98 CBW
1008:1526 2d 02 00 SUB AX,0x2
1008:1529 8b d8 MOV aBuffer,AX
1008:152b 83 fb 07 CMP aBuffer,0x7
1008:152e 77 1a JA switchD_1008:1532::caseD_4
1008:1530 03 db ADD aBuffer,aBuffer
switchD_1008:1532::switchD
1008:1532 2e ff a7 JMP word ptr CS:[aBuffer + 0x15f4]
f4 15
switchD_1008:1532::caseD_3 XREF[1]: 1008:1532(j)
switchD_1008:1532::caseD_5
switchD_1008:1532::caseD_9
switchD_1008:1532::caseD_2
1008:1537 c7 06 72 MOV word ptr [DAT_1030_0072],0x0
00 00 00
1008:153d c7 06 b0 MOV word ptr [BOOL_1030_12b0],0x0
12 00 00
1008:1543 9a 80 17 CALLF FUN_1010_1780 undefined2 FUN_1010_1780(void)
10 10
1008:1548 eb 06 JMP LAB_1008_1550
switchD_1008:1532::caseD_6 XREF[2]: 1008:152e(j), 1008:1532(j)
switchD_1008:1532::caseD_7
switchD_1008:1532::caseD_8
switchD_1008:1532::caseD_4
1008:154a c7 06 72 MOV word ptr [DAT_1030_0072],0x1
00 01 00
However, in the decompilation the cases aren’t set up at all and the ‘guard’ becomes an if (true). Here’s the decompilation:
if ((497 < g_nUnreadSamplesInBuffer_24a2) || (DAT_1030_0072 != 0)) {
  if (true) {
    switch(*pwf) {
    default:
      DAT_1030_0072 = 0;
      BOOL_1030_12b0 = (0 | FALSE);
      FUN_1010_1780();
      goto LAB_1008_1550;
    }
  }
  DAT_1030_0072 = 1;
}
LAB_1008_1550:
if (DAT_1030_0072 == 0) {
.
.
.
Expected behavior
The output should recover the correct branches and guard condition given the accurate assembly.
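For reference, the dispatch sequence in the listing can be modelled in C. This is only a sketch: the names dispatch, TARGET_1537 and TARGET_154A are made up for illustration, and the jump table contents are inferred from the case labels in the listing rather than read from the bytes at CS:0x15f4.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two destinations seen in the listing. */
enum target {
    TARGET_1537, /* caseD_2/3/5/9: clears DAT_1030_0072, calls FUN_1010_1780 */
    TARGET_154A  /* caseD_4/6/7/8 and the JA guard: sets DAT_1030_0072 to 1 */
};

/* Jump table indexed by (value - 2); contents inferred from the case labels. */
static const enum target jump_table[] = {
    TARGET_1537, /* 2 */
    TARGET_1537, /* 3 */
    TARGET_154A, /* 4 */
    TARGET_1537, /* 5 */
    TARGET_154A, /* 6 */
    TARGET_154A, /* 7 */
    TARGET_154A, /* 8 */
    TARGET_1537  /* 9 */
};

static_assert(sizeof jump_table / sizeof jump_table[0] == 8,
              "one entry per switch value 2..9");

/* Models MOV AL / CBW / SUB AX,2 / CMP BX,7 / JA / JMP CS:[BX + 0x15f4]. */
enum target dispatch(int8_t first_byte)
{
    /* CBW sign-extends AL into AX; SUB AX,2 rebases with 16-bit wraparound. */
    uint16_t index = (uint16_t)((int16_t)first_byte - 2);

    /* JA is an unsigned compare, so values below 2 wrap around and are
       caught by the same guard as values above 9. */
    if (index > 7)
        return TARGET_154A;

    /* ADD BX,BX in the listing only scales the index to 2-byte entries. */
    return jump_table[index];
}
```

With this model, dispatch(1), dispatch(10) and dispatch(-1) all land on the same destination as dispatch(4), which is the behaviour the guard is meant to express.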
Environment (please complete the following information):
- OS: Windows 10
- Java Version: 11.0.2
- Ghidra Version: 10.1.4
- Ghidra Origin: locally built
This was caused partly by switch analysis interacting with the “Eliminate unreachable code” option. You had this option toggled off, which prevents any branches from being removed. In particular, the conditional branch guarding the switch is not removed as it ordinarily would be; this is the cause of the “if (true)” syntax. The presence of this extra branch prevents the control flow around the indirect jump from being structured as a normal switch. It would usually be structured as one case labeled “default” and 4 other case labels sharing a single destination/body. The extra external branch into the switch prevents the branch's destination from being interpreted as the “default”; instead it is interpreted as the switch exit. This also causes the “goto” statement.
None of this is technically a bug, just awkward structuring forced by the analysis option. However, the decompiler should have enumerated the 4 case labels instead of using the “default” label: a switch where some values cause an immediate jump to the exit cannot also have a “default” case.
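The two structurings described above can be compared side by side. This is a minimal sketch, not actual Ghidra output: the function names are hypothetical, and the return value stands for the value DAT_1030_0072 ends up with (0 for the caseD_2/3/5/9 body, 1 for the other destination).

```c
#include <assert.h>

/* Usual structuring: 4 case labels share one body, everything else is
   "default". */
int with_default(int v)
{
    int flag;
    switch (v) {
    case 2: case 3: case 5: case 9:
        flag = 0;
        break;
    default:
        flag = 1;
        break;
    }
    return flag;
}

/* Structuring with the option off: the other destination is the switch
   exit, so only the 4 labels are enumerated and there is no "default". */
int with_enumerated_labels(int v)
{
    switch (v) {
    case 2: case 3: case 5: case 9:
        return 0; /* stands in for the goto LAB_1008_1550 */
    }
    /* Switch exit: reached by 4, 6, 7, 8 and every out-of-range value. */
    return 1;
}

/* Checks that both shapes agree over the whole signed-byte range. */
void check_equivalence(void)
{
    for (int v = -128; v <= 127; ++v)
        assert(with_default(v) == with_enumerated_labels(v));
}
```

Both functions compute the same result for every input; the difference is only in which destination is treated as part of the switch and which as its exit.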
Thanks, that fixed this example.
I had disabled that option, thinking I’d ensure my output hadn’t inadvertently dropped something important! In this regard, is there (or should there be) a way to get the switch analysis to ignore the “Eliminate unreachable code” option for its operation?