Performing use-after-free analysis in angr


Hi, we are studying how to detect use-after-free (UAF) bugs using angr.

Consider the code below.

test.c

#include <stdio.h>
#include <stdlib.h>

typedef struct samplestruct {
        int number;
} sample;

int main(int argc, char* argv[])
{
        int argv1 =0;
        sample *one;
        sample *two;

        one = malloc(256);
        printf("[1] one->number: %d\n", one->number);
        one->number = 54321;
        printf("[2] one->number: %d\n", one->number);

        free(one);
        two = malloc(256);

        printf("[3] two->number: %d\n", two->number);

        argv1 = atoi(argv[1]);

        if(argc >=2 && argv1 > 5) {
                // use-after-free
                two->number = 12345;
                printf("[4] two->number: %d\n", two->number);
        }
        else
                printf("[!] Good..\n");
        free(two);
        return 0;
}

hook_symbol : malloc.py

import simuvex
from simuvex.s_type import SimTypeLength, SimTypeTop
import itertools

######################################
# malloc
######################################

malloc_mem_counter = itertools.count()

class malloc(simuvex.SimProcedure):
    #pylint:disable=arguments-differ

    def run(self, sim_size):
        self.argument_types = {0: SimTypeLength(self.state.arch)}
        self.return_type = self.ty_ptr(SimTypeTop(sim_size))

        if self.state.se.symbolic(sim_size):
            size = self.state.se.max_int(sim_size)
            if size > self.state.libc.max_variable_size:
                size = self.state.libc.max_variable_size
        else:
            size = self.state.se.any_int(sim_size)

        addr = self.state.libc.heap_location
        self.state.libc.heap_location += size
        print "[*] Called malloc 0x%x 0x%x" % (addr, size)
        return addr

hook_symbol : free.py

import simuvex
from simuvex.s_type import SimTypeTop

######################################
# free
######################################
class free(simuvex.SimProcedure):
    #pylint:disable=arguments-differ

    def run(self, ptr): #pylint:disable=unused-argument
        self.argument_types = {0: self.ty_ptr(SimTypeTop())}
        print "[*] Call Free!"
        return self.state.se.Unconstrained('free', self.state.arch.bits)

The angr script, test.py, is as follows:

import angr, simuvex
import claripy

p = angr.Project("test", load_options={'auto_load_libs':False})

p.analyses.CFG()

num_input_chars = 1
input_str = claripy.BVS("argv1", 8 * num_input_chars)

init_state = p.factory.entry_state(args=["./test", input_str])

############################
#init_state.inspect.b('mem_read')
def debug_func_read(state):
        print "State %s is about to do a memory read! " %state.inspect.mem_read_address

def debug_func_write(state):
        print "State %s is about to do a memory write! " %state.inspect.mem_write_address

init_state.inspect.b('mem_read', when=simuvex.BP_BEFORE, action=debug_func_read)
init_state.inspect.b('mem_write', when=simuvex.BP_BEFORE, action=debug_func_write)


for i in xrange(num_input_chars):
        current_byte = input_str.get_byte(i)
        init_state.add_constraints(
                claripy.Or(
                        claripy.And(current_byte >= 'a', current_byte <= 'z'),
                        claripy.And(current_byte >= 'A', current_byte <= 'Z'),
                        claripy.And(current_byte >= '0', current_byte <= '9')
                )
        )
pg = p.factory.path_group(init_state)
pg.explore(find=0x400716)
assert len(pg.found) > 0
found_state = pg.found[0].state
possible_inputs = found_state.se.any_n_str(input_str, 20)

for input in possible_inputs:
        print input

However, the address returned by malloc is not what we expected. Judging by the output log, malloc appears to be unaffected by the preceding free.

The log output is as follows:

...

[*] Called malloc 0xc0000f18 0x100
State <BV64 0xc0000f18> is about to do a memory read! //printf("[1] one->number: %d\n", one->number);
State <BV64 0xc0000f18> is about to do a memory write! //one->number = 54321;
State <BV64 0xc0000f18> is about to do a memory read!  //printf("[2] one->number: %d\n", one->number);
..
[*] Called free !!! //free(one);
..
[*] Called malloc 0xc0001018 0x100   // Not the Same !!
State <BV64 0xc0001018> is about to do a memory read!  //two->number = 12345;    // use-after-free
State <BV64 0xc0001018> is about to do a memory write! //printf("[4] two->number: %d\n", two->number);
..
7
8
6
9

Without angr, the program runs as intended:

1 : 0x741010 (malloc return : one->number)
[1] one->number: 0
[2] one->number: 54321
2 : 0x741010 (malloc return : two->number)
[3] two->number: 1551326072
[4] two->number: 12345

Under angr, however, the second malloc returns a different address, which makes me wonder whether the free hook works at all.

( 0xc0000f18 ≠ 0xc0001018 )

What is the problem?

Issue Analytics

  • State:closed
  • Created 6 years ago
  • Comments:5 (3 by maintainers)

Top GitHub Comments

2 reactions
rhelmot commented, Jun 1, 2017

You can see what the malloc and free SimProcedures do; you’ve posted them in the issue! They do not emulate any sane memory allocator. No metadata is stored in memory, and free does nothing. If you want to write a UAF analysis, you should keep a list of freed regions, probably in state.procedure_data.global_variables, and verify that no reads or writes fall into any of them.
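
To illustrate the bookkeeping described above, here is a minimal sketch in plain Python with no angr dependency. The class name FreedRegionTracker and its methods are hypothetical, not part of angr. The assumption is that the malloc and free hooks would call on_malloc and on_free, and that the mem_read and mem_write breakpoints would call is_use_after_free with the concretized address. The addresses are taken from the log in the question.

```python
class FreedRegionTracker:
    """Hypothetical helper: tracks heap regions as half-open [start, end) intervals."""

    def __init__(self):
        self.live = {}   # start address -> size of each live allocation
        self.freed = []  # (start, end) pairs for regions that have been freed

    def on_malloc(self, addr, size):
        end = addr + size
        # If the allocator hands out memory that overlaps a freed region,
        # that region is valid again. Simplification: the whole freed
        # region is dropped even when the overlap is only partial.
        self.freed = [(s, e) for (s, e) in self.freed if e <= addr or s >= end]
        self.live[addr] = size

    def on_free(self, addr):
        size = self.live.pop(addr, None)
        if size is None:
            return False  # unknown pointer or double free
        self.freed.append((addr, addr + size))
        return True

    def is_use_after_free(self, addr, length=1):
        # True if [addr, addr + length) overlaps any freed region.
        end = addr + length
        return any(s < end and addr < e for (s, e) in self.freed)


tracker = FreedRegionTracker()
tracker.on_malloc(0xc0000f18, 0x100)   # one = malloc(256)
tracker.on_free(0xc0000f18)            # free(one)
tracker.on_malloc(0xc0001018, 0x100)   # two = malloc(256)

stale_access = tracker.is_use_after_free(0xc0000f18, 4)  # through 'one'
valid_access = tracker.is_use_after_free(0xc0001018, 4)  # through 'two'
print(stale_access, valid_access)
```

Note that with this bookkeeping the writes through two in test.c are not flagged, because two is a live allocation. Only an access through the stale pointer one after free(one) would be reported.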

0 reactions
zardus commented, Feb 24, 2018

As a side note: you could disable the use of the malloc/free/etc simprocedures and let angr execute the actual heap code. We have done this in prior projects, but you still need to write an analysis to detect the vulnerability cases.
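
To make the difference concrete, the sketch below contrasts the two behaviours in plain Python. BumpAllocator mirrors what the posted malloc and free hooks do (the heap pointer only moves forward and free records nothing). ReusingAllocator is a deliberately simplified stand-in for real heap code, which is far more complex. Both class names and run_test_c are hypothetical, and the base addresses are taken from the logs in the question.

```python
class BumpAllocator:
    """Models the posted hooks: the heap pointer only ever moves forward."""

    def __init__(self, base):
        self.next_addr = base

    def malloc(self, size):
        addr = self.next_addr
        self.next_addr += size
        return addr

    def free(self, addr):
        pass  # like the posted free hook, nothing is recorded


class ReusingAllocator(BumpAllocator):
    """Toy stand-in for a real allocator: freed chunks of equal size are reused."""

    def __init__(self, base):
        super().__init__(base)
        self.sizes = {}        # live address -> size
        self.free_chunks = {}  # size -> list of freed addresses

    def malloc(self, size):
        bucket = self.free_chunks.get(size)
        addr = bucket.pop() if bucket else super().malloc(size)
        self.sizes[addr] = size
        return addr

    def free(self, addr):
        size = self.sizes.pop(addr)
        self.free_chunks.setdefault(size, []).append(addr)


def run_test_c(allocator):
    # Same allocation pattern as test.c: malloc, free, malloc.
    one = allocator.malloc(256)
    allocator.free(one)
    two = allocator.malloc(256)
    return one, two


bump_one, bump_two = run_test_c(BumpAllocator(0xc0000f18))
reuse_one, reuse_two = run_test_c(ReusingAllocator(0x741010))
print(hex(bump_one), hex(bump_two))    # addresses differ, as in the angr log
print(hex(reuse_one), hex(reuse_two))  # same address, as in the native run
```

When the real heap code runs, the second malloc can return the same address as the first, so any detection analysis has to treat a reallocated region as valid again rather than as freed.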
