Decompiler misinterpreting x86 'switch' boundaries

Describe the bug

In the following assembly (from address 1008:151f onwards), a guard in front of the switch's indirect jump sends every value outside the range 2 <= AX <= 9 to the same branch as caseD_4:

   1008:1510 81 3e a2        CMP        word ptr [g_nUnreadSamplesInBuffer_24a2],498
             24 f2 01
   1008:1516 7d 07           JGE        LAB_1008_151f
   1008:1518 83 3e 72        CMP        word ptr [DAT_1030_0072],0x0
             00 00
   1008:151d 74 31           JZ         LAB_1008_1550
                             LAB_1008_151f                                   XREF[1]:     1008:1516(j)  
   1008:151f c4 5e fa        LES        aBuffer,[BP + pwf]
   1008:1522 26 8a 07        MOV        AL,byte ptr ES:[aBuffer]
   1008:1525 98              CBW
   1008:1526 2d 02 00        SUB        AX,0x2
   1008:1529 8b d8           MOV        aBuffer,AX
   1008:152b 83 fb 07        CMP        aBuffer,0x7
   1008:152e 77 1a           JA         switchD_1008:1532::caseD_4
   1008:1530 03 db           ADD        aBuffer,aBuffer
                             switchD_1008:1532::switchD
   1008:1532 2e ff a7        JMP        word ptr CS:[aBuffer + 0x15f4]
             f4 15
                             switchD_1008:1532::caseD_3                      XREF[1]:     1008:1532(j)  
                             switchD_1008:1532::caseD_5
                             switchD_1008:1532::caseD_9
                             switchD_1008:1532::caseD_2
   1008:1537 c7 06 72        MOV        word ptr [DAT_1030_0072],0x0
             00 00 00
   1008:153d c7 06 b0        MOV        word ptr [BOOL_1030_12b0],0x0
             12 00 00
   1008:1543 9a 80 17        CALLF      FUN_1010_1780                                    undefined2 FUN_1010_1780(void)
             10 10
   1008:1548 eb 06           JMP        LAB_1008_1550
                             switchD_1008:1532::caseD_6                      XREF[2]:     1008:152e(j), 1008:1532(j)  
                             switchD_1008:1532::caseD_7
                             switchD_1008:1532::caseD_8
                             switchD_1008:1532::caseD_4
   1008:154a c7 06 72        MOV        word ptr [DAT_1030_0072],0x1
             00 01 00
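
The guard at 1008:1526 to 1008:1530 can be restated in C as a rough sketch. It assumes, going by the opcode bytes (8b d8, 83 fb 07, 03 db), that the operand Ghidra labels aBuffer there is the BX register. The function names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the JA at 1008:152e would be taken,
 * i.e. when control goes straight to caseD_4 and skips the jump table. */
static bool takes_fallback(int8_t al) {
    int16_t ax = (int16_t)al;          /* CBW: sign-extend AL into AX   */
    uint16_t bx = (uint16_t)(ax - 2);  /* SUB AX,0x2 then MOV BX,AX     */
    return bx > 7;                     /* CMP BX,0x7; JA is unsigned    */
}

/* The same condition written as a plain range test on the input byte. */
static bool outside_2_to_9(int8_t al) {
    return al < 2 || al > 9;
}

/* Offset of the jump-table word read by JMP CS:[BX + 0x15f4].
 * Only meaningful when takes_fallback(al) is false. */
static uint16_t table_entry_offset(int8_t al) {
    uint16_t bx = (uint16_t)(al - 2);
    return (uint16_t)(0x15f4 + 2 * bx); /* ADD BX,BX doubles the index  */
}
```

Because the subtraction wraps and JA compares unsigned, a single compare rejects both small and large inputs: an input of 1 becomes 0xFFFF, which is above 7. The table therefore has eight word-sized entries, one for each value from 2 through 9.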

However, in the decompilation the cases aren’t set up at all, and the ‘guard’ becomes an if (true). Here’s the decompilation:

   if ((497 < g_nUnreadSamplesInBuffer_24a2) || (DAT_1030_0072 != 0)) {
      if (true) {
         switch(*pwf) {
         default:
            DAT_1030_0072 = 0;
            BOOL_1030_12b0 = (0 | FALSE);
            FUN_1010_1780();
            goto LAB_1008_1550;
         }
      }
      DAT_1030_0072 = 1;
   }
LAB_1008_1550:
   if (DAT_1030_0072 == 0) {
.
.
.

Expected behavior

The output should recover the correct branches and guard condition, given that the disassembly is accurate.

Environment (please complete the following information):

  • OS: Windows 10
  • Java Version: 11.0.2
  • Ghidra Version: 10.1.4
  • Ghidra Origin: locally built

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

caheckman commented, Jun 10, 2022

This was caused partly by switch analysis interacting with the “Eliminate unreachable code” option. You had this toggled off, which prevents any branches from getting removed. In particular, the conditional branch guarding the switch does not get removed as it ordinarily would be. This is the cause of the “if (true)” syntax. The presence of this extra branch prevents the control-flow around the indirect jump from being structured as a normal switch. It would usually be structured as one case labeled “default”, and 4 other case labels sharing a single destination/body. The extra external branch into the switch prevents this destination from being interpreted as the “default”, and instead it is interpreted as the switch exit. This also causes the “goto” statement.

None of this is technically a bug, just awkward structuring forced by the analysis option. However, the decompiler should have enumerated the 4 case labels instead of using the “default” label. A switch where some values cause an immediate jump to the exit cannot also have a “default” case.
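
For reference, here is a hedged sketch of the "normal" structuring described above, reconstructed from the case labels in the listing (caseD_2, caseD_3, caseD_5 and caseD_9 share one body; caseD_4, caseD_6, caseD_7, caseD_8 and the out-of-range path share the other). It is not actual Ghidra output. The globals and FUN_1010_1780 are stand-ins declared only so the fragment compiles, and dispatch and fun_1010_1780_calls are hypothetical names.

```c
#include <stdint.h>

/* Stand-ins for the globals and the far callee seen in the listing. */
static uint16_t DAT_1030_0072;
static uint16_t BOOL_1030_12b0;
static int fun_1010_1780_calls;               /* hypothetical call counter */
static void FUN_1010_1780(void) { fun_1010_1780_calls++; }

/* tag models the byte read from *pwf and sign-extended by CBW. */
static void dispatch(int8_t tag) {
    switch (tag) {
    case 2:
    case 3:
    case 5:
    case 9:
        /* The four case labels that share the body at 1008:1537. */
        DAT_1030_0072 = 0;
        BOOL_1030_12b0 = 0;
        FUN_1010_1780();
        break;
    default:
        /* 4, 6, 7, 8 and every value outside 2..9 (body at 1008:154a). */
        DAT_1030_0072 = 1;
        break;
    }
}
```

In this form the guard disappears into the "default" label, so neither the "if (true)" wrapper nor the "goto" is needed.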

Wall-AF commented, Jun 10, 2022

> This was caused partly by switch analysis interacting with the “Eliminate unreachable code” option.

Thanks, that fixed this example.

I had turned that option off, thinking it would ensure my output hadn’t inadvertently dropped something important! In this regard, is there (or should there be) a way to get the switch analysis to ignore the “Eliminate unreachable code” option for its operation?
