[angr] Angr performance

spark at trendmicro.com
Sun Jan 17 15:04:13 PST 2016


My email bounced back. Retrying with the public mailing list.

From: "Sean Park (RD-AU)" <spark at trendmicro.com>
Date: Monday, 18 January 2016 10:02 am
To: angr <angr at lists.cs.ucsb.edu>
Subject: Angr performance

Hi All,

I know angr was not designed specifically for malware analysis. I’m just trying to figure out which angr component actually impacts performance, since I saw it take about 1 minute to XOR-decode 660 bytes of data with no symbolic initial condition set at the entry point. In real life, malware unpacks tens or hundreds of kilobytes at runtime, in multiple layers, at arbitrary locations. So this performance overhead is difficult to tolerate from a malware analysis standpoint.

  *   Does the performance overhead mainly stem from IR translation at each instruction (also with the SYMBOLIC_INITIAL_VALUES flag)?
  *   The TRACK_ACTION_HISTORY flag is not enabled by default, so taint tracking (DO_XXX, TRACK_MEMORY_XXX, TRACK_REGISTER_XXX) presumably isn’t the major contributor to this overhead. Please comment.
  *   Is it TRACK_CONSTRAINT_ACTIONS that causes this overhead?

It would be much appreciated if you could enlighten me as to the cause of the performance overhead. I’m only trying to understand the problem.

Sean

From: "Sean Park (RD-AU)" <spark at trendmicro.com>
Date: Wednesday, 13 January 2016 9:35 am
To: Yan <zardus at gmail.com>, Fish Wang <fish at cs.ucsb.edu>
Cc: "Sean Park (RD-AU)" <spark at trendmicro.com>, angr <angr at lists.cs.ucsb.edu>
Subject: Re: [angr] CFG for self-modifying code

Thanks for your comment, Yan.

I am trying to create a CFG for an arbitrary piece of malware or shellcode, in which case you wouldn’t know at which point the code will be unpacked or how many layers there are. I would go with Fish’s suggestion since that’s a more strategic approach. I will figure it out and let you guys know how I go.

Cheers,
Sean

From: Yan <zardus at gmail.com>
Date: Tuesday, 12 January 2016 1:54 pm
To: Fish Wang <fish at cs.ucsb.edu>
Cc: "Sean Park (RD-AU)" <spark at trendmicro.com>, angr <angr at lists.cs.ucsb.edu>
Subject: Re: [angr] CFG for self-modifying code

If you know when the shellcode is fully unpacked (if such a point exists), you can push the memory contents back into CLE (Andrew can probably give you some pointers on doing this) and then simply treat it as a separate program with a different entry point. It could be cool to have official API support for such an action, actually (if you want to get your hands dirty and send along a PR!).

- Yan
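The "dump the unpacked memory and treat it as a separate program" idea can be sketched on a toy model. This is not angr or CLE code: the 3-byte instruction set, the XOR key, and every function name below are hypothetical, chosen only to show the shape of the approach. The stub is run concretely until control reaches a byte that was written at run time; the memory at that moment is snapshotted, and a plain static sweep is then started from the new entry point.

```python
# Toy model only: a hypothetical 3-byte ISA (opcode, a, b), not x86, angr, or CLE.
OPCODES = {1: "SET", 2: "XOR", 3: "INC", 4: "JLT", 5: "JMP", 6: "HLT"}
KEY = 0x5A  # hypothetical XOR key


def make_packed_image():
    """Decoder stub at bytes 0..14, XOR-encoded payload at bytes 15..20."""
    stub = bytes([
        1, 15, 0,   # 0:  SET reg = 15 (first payload byte)
        2, KEY, 0,  # 3:  XOR mem[reg] ^= KEY
        3, 0, 0,    # 6:  INC reg
        4, 21, 3,   # 9:  JLT if reg < 21 goto 3
        5, 15, 0,   # 12: JMP 15 (enter the decoded payload)
    ])
    payload = bytes([3, 0, 0, 6, 0, 0])  # INC ; HLT
    return stub + bytes(b ^ KEY for b in payload)


def run_until_unpacked(image, max_steps=1000):
    """Run the stub and stop once the program counter reaches written memory.

    Returns (memory snapshot, new entry point).
    """
    mem = bytearray(image)
    written = set()
    pc = reg = 0
    for _ in range(max_steps):
        if pc in written:
            return bytes(mem), pc
        op, a, b = mem[pc:pc + 3]
        if op == 1:
            reg = a
        elif op == 2:
            mem[reg] ^= a
            written.add(reg)
        elif op == 3:
            reg += 1
        elif op == 4:
            if reg < a:
                pc = b
                continue
        elif op == 5:
            pc = a
            continue
        else:  # HLT or an undecodable byte
            break
        pc += 3
    raise RuntimeError("execution never reached memory written at run time")


def linear_sweep(image, entry):
    """Static disassembly from `entry` until HLT or an undecodable byte."""
    listing = []
    pc = entry
    while pc + 3 <= len(image) and image[pc] in OPCODES:
        listing.append((pc, OPCODES[image[pc]]))
        if image[pc] == 6:
            break
        pc += 3
    return listing


if __name__ == "__main__":
    packed = make_packed_image()
    snapshot, entry = run_until_unpacked(packed)
    print("static sweep of packed image:  ", linear_sweep(packed, entry))
    print("static sweep of unpacked image:", linear_sweep(snapshot, entry))
```

The sweep over the packed image finds nothing decodable at the payload address, while the same sweep over the snapshot recovers the payload. With several unpacking layers, the same step would have to be repeated once per layer.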

On Mon, Jan 11, 2016 at 6:51 PM, Fish Wang <fish at cs.ucsb.edu> wrote:
Hi Sean,

CFG does not support self-modifying code right now (since it’s pure static analysis). You might want to use symbolic execution in angr to execute or dump all the shellcode being executed. With that information, it’s very easy to show addresses, instructions, and even states of everything along the path. If you want to generate a CFG for self-modifying code, you really have to faithfully simulate the execution, which is difficult for a static analysis to do.

We’ve done it for some CTF challenges (that have some simple unpacking or self-modification mechanisms). They are not included in the angr-doc repo though, sorry :-(

Best,
Fish
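Why a CFG for self-modifying code has to come from simulated execution can be shown with a small self-contained sketch. It deliberately avoids angr: the 3-byte instruction set, the key, and the helper names are all made up for illustration. The interpreter decodes each instruction from memory as it stands at the moment of execution and records the edge from the previous instruction, so the payload that only exists after XOR-decoding still ends up in the graph.

```python
# Toy illustration only: a hypothetical 3-byte ISA (opcode, a, b), not x86 and not angr.
SET, XOR, INC, JLT, JMP, HLT = 1, 2, 3, 4, 5, 6
NAMES = {SET: "SET", XOR: "XOR", INC: "INC", JLT: "JLT", JMP: "JMP", HLT: "HLT"}
KEY = 0x5A  # hypothetical XOR key


def build_image():
    """Return a decoder stub (bytes 0..14) followed by an XOR-encoded payload (15..20)."""
    stub = bytes([
        SET, 15, 0,   # 0:  reg = 15, the first payload byte
        XOR, KEY, 0,  # 3:  mem[reg] ^= KEY
        INC, 0, 0,    # 6:  reg += 1
        JLT, 21, 3,   # 9:  if reg < 21, loop back to address 3
        JMP, 15, 0,   # 12: jump into the freshly decoded payload
    ])
    payload = bytes([INC, 0, 0, HLT, 0, 0])
    return stub + bytes(b ^ KEY for b in payload)


def trace(image, max_steps=1000):
    """Execute concretely, recording each executed instruction and control-flow edge."""
    mem = bytearray(image)  # private copy; the caller's image stays packed
    pc, reg, prev = 0, 0, None
    nodes, edges = {}, set()
    for _ in range(max_steps):
        op, a, b = mem[pc:pc + 3]
        if op not in NAMES:
            raise ValueError(f"undecodable opcode {op:#x} at address {pc}")
        nodes[pc] = NAMES[op]  # decoded from memory as it is at this step
        if prev is not None:
            edges.add((prev, pc))
        prev = pc
        if op == HLT:
            break
        if op == JMP:
            pc = a
            continue
        if op == JLT and reg < a:
            pc = b
            continue
        if op == SET:
            reg = a
        elif op == XOR:
            mem[reg] ^= a
        elif op == INC:
            reg += 1
        pc += 3
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = trace(build_image())
    for addr in sorted(nodes):
        print(f"{addr:3d}: {nodes[addr]}")
    print(sorted(edges))
```

A single concrete run only covers the path actually taken. Covering other paths is where symbolic execution would come in, at the cost of the kind of overhead discussed at the top of this thread.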

From: angr [mailto:angr-bounces at lists.cs.ucsb.edu] On Behalf Of spark at trendmicro.com
Sent: Monday, January 11, 2016 7:58 PM
To: angr at lists.cs.ucsb.edu
Subject: [angr] CFG for self-modifying code

Hi people,

I was trying to get a CFG for self-modifying shellcode. I used the following code.

import angr

project = angr.Project('shellcode.exe', support_selfmodifying_code=True, load_options={'auto_load_libs':False})
cfg = project.analyses.CFG(keep_state=True, enable_symbolic_back_traversal=True)

It appears angr creates a CFG for the original code instead of the modified code. Is there any way to get a CFG by symbolically executing the code? Any example code that does this, showing the address and disassembly for each path, would be much appreciated.

Regards,
Sean



TREND MICRO EMAIL NOTICE

The information contained in this email and any attachments is confidential
and may be subject to copyright or other intellectual property protection.
If you are not the intended recipient, you are not authorized to use or
disclose this information, and we request that you notify us by reply mail or
telephone and delete the original message from your mail system.




_______________________________________________
angr mailing list
angr at lists.cs.ucsb.edu
https://lists.cs.ucsb.edu/mailman/listinfo/angr


