Problem when running angr to get all paths and their corresponding test inputs for a given function

See original GitHub issue

Hi

I want to explore all paths of a given function and then get concrete inputs for each path. I found that the Caller object can find all paths, so I wrote the following script.

import angr
def main():
    b = angr.Project("symbolic_pointer")
    cfg = b.analyses.CFG(keep_state=True)
    target_func = cfg.kb.functions.function(name="test")
    print target_func
    p = b.factory.path()
    x = p.state.memory.load(0x1000, 4)
    y = p.state.memory.load(0x2000, 4)
    c = b.surveyors.Caller(0x4005a6, (0x1000, 0x2000), start=p)
    print tuple(c.iter_returns())
    print c
    print c.found[1]
    state = c.found[3].state
    solution = state.se.any_int(state.memory.load(0x1000, 4))
    print solution
res = main()

The source of the tested code is:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>

int test(int* x, int *y)
{
  //int y = atoi(x);
  int a[4] = {0, 1, 2, 3};
  switch (a[*x - 10]) {
  case 0:
    printf("index is 0\n");
    return 0;
  case 1:
    printf("index is 1\n");
    return 1;
  case 2:
    printf("index is 2\n");
    return 2;
  case 3:
    printf("index is 3\n");
    return 3;
  default:
    return 4; 
  } 
}
void main()
{  
  int x = 0, y = 0;
  test(&x, &y);
}

However, I got the same value of variable x for all paths. Below is the output.

Function test [0x4005a6]
  Syscall: False
  SP difference: 0
  Has return: True
  Returning: True
  Arguments: reg: [72], stack: [8L, 0L]
  Blocks: [0x400660, 0x400601, 0x400623, 0x4005a6, 0x400606, 0x400651, 0x400608, 0x40062a, 0x40064c, 0x400645, 0x4005f1, 0x400612, 0x400634, 0x4005f6, 0x400619, 0x4005fa, 0x40063b, 0x4005fc, 0x400665]
  Calling convention: System V AMD64 - AMD64 [<rdi>, [8h]]
((<BV64 0x4>, <Path with 6 runs (at 0x4004b0)>), (<BV64 0x1>, <Path with 7 runs (at 0x4004b0)>), (<BV64 0x4>, <Path with 7 runs (at 0x4004b0)>), (<BV64 0x3>, <Path with 10 runs (at 0x4004b0)>), (<BV64 0x2>, <Path with 9 runs (at 0x4004b0)>), (<BV64 0x0>, <Path with 9 runs (at 0x4004b0)>))
<Explorer with paths: 0 active, 0 spilled, 0 deadended, 0 errored, 0 unconstrained, 6 found, 0 avoided, 0 deviating, 0 looping, 0 lost>
<Path with 7 runs (at 0x4004b0)>
167772288

That is to say, I get the same value, 167772288, no matter whether I change state = c.found[3].state to state = c.found[0].state, state = c.found[1].state, or any other path.

Am I using angr incorrectly? Or is the Caller object not capable of producing concrete test inputs?

Regards

Ting Chen

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 14 (8 by maintainers)

Top GitHub Comments

rhelmot commented, May 10, 2016 (1 reaction)

It’s a little hard to follow your example, but you have to remember that C code is, in the end, just an abstraction over assembly code, and angr’s analysis runs on assembly code. “uninitialized” means that there is no known value for the memory when you read from it. By default in angr, all reads from uninitialized values return unconstrained symbolic variables.

So, if you have a pointer to an array and you use it to access a[-1], the result will depend on the memory layout of the program. If something else is using the memory pointed to by &a[-1], then the read will return that value, or if a previous function call used that region of memory as part of its stack frame, then you’ll get back whatever value was written there previously. However, if nothing has ever been written to that address before, you’ll get back a symbolic value. This makes sense because when you’re analyzing a program with incomplete state, like in your example where you start execution from the start of the function, it’s possible that the value at that address could be user-controlled from a previous function call.
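
The behaviour described above can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not angr's implementation: the ToyMemory class, its method names, and the placeholder naming scheme are all hypothetical. It shows that a read from an address that was written returns the stored value, while a read from an untouched address yields a fresh placeholder standing in for an unconstrained symbolic variable.

```python
import itertools


class ToyMemory:
    """Hypothetical toy model of memory with symbolic uninitialized reads."""

    def __init__(self):
        self._cells = {}                   # address -> stored value or placeholder
        self._counter = itertools.count()  # used to name fresh placeholders

    def store(self, addr, value):
        self._cells[addr] = value

    def load(self, addr):
        if addr not in self._cells:
            # First read of an untouched cell: invent a placeholder name and
            # remember it, so later reads of the same address agree.
            self._cells[addr] = "mem_%x_%d" % (addr, next(self._counter))
        return self._cells[addr]


mem = ToyMemory()
mem.store(0x2000, 7)        # something wrote here earlier

written = mem.load(0x2000)  # previously written: the concrete value comes back
fresh = mem.load(0x1000)    # never written: a placeholder comes back
again = mem.load(0x1000)    # same address, same placeholder

print(written, fresh, again)
```

In this model, as in the explanation above, the result of an out-of-bounds read depends entirely on whether anything was stored at that address beforehand.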

ltfish commented, May 8, 2016 (1 reaction)

In your original example, the input variable x of function test is 0x1000, which should be an uninitialized memory cell (I didn’t see you initializing it anywhere in your script). That means *x is fully unconstrained (can be any value), and &a[*x - 10] could point anywhere in the memory. If &a[*x - 10] points to an uninitialized memory cell, then a[*x - 10] is again a fully unconstrained symbolic value, which will be constrained to different values (like 0, 1, 2, 3) based on different paths that angr executes afterwards. However, a[*x - 10] == 0 does not necessarily mean *x == 10, since it still holds when &a[*x - 10] points to an unconstrained memory cell. Therefore angr is giving you the expected result.
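
The point that a[*x - 10] == 0 does not imply *x == 10 can be checked with a small brute-force sketch in plain Python. It does not use angr. The helper switch_result, the stray_cell parameter (standing in for whatever an out-of-bounds read returns), and the small search ranges are assumptions made for illustration only.

```python
A = [0, 1, 2, 3]  # contents of the local array a in test()


def switch_result(x_value, stray_cell):
    """Model the return value of test() when *x == x_value.

    stray_cell is a hypothetical stand-in for the value an out-of-bounds
    read of a[] would produce (unconstrained in the symbolic setting).
    """
    index = x_value - 10
    selector = A[index] if 0 <= index < len(A) else stray_cell
    return selector if selector in (0, 1, 2, 3) else 4


# Enumerate a small range of inputs and stray-cell values, keeping every
# combination that drives execution down the "case 0" path.
solutions = [
    (x, cell)
    for x in range(0, 20)
    for cell in range(0, 6)
    if switch_result(x, cell) == 0
]

in_bounds = [s for s in solutions if 10 <= s[0] <= 13]
out_of_bounds = [s for s in solutions if not 10 <= s[0] <= 13]

print("in-bounds x values:", sorted({x for x, _ in in_bounds}))
print("out-of-bounds solutions:", len(out_of_bounds))
```

Within the array bounds, only *x == 10 reaches case 0, but every out-of-bounds *x also reaches it whenever the stray cell happens to hold 0. A solver asked for any satisfying input is free to pick one of those out-of-bounds values, and it may pick the same one for every path.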
