Angr

2018-09-06  Read by 2160 people  Black_Sun

angr: Basic Introduction

What can angr do?

Installing angr

# dependency
sudo apt-get install python-dev libffi-dev build-essential virtualenvwrapper
# install
# it is best to install angr inside a virtual environment
mkvirtualenv angr && pip install angr
# for more details, see https://docs.angr.io/INSTALL.html

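Installing on Ubuntu 16.04

virtualenvwrapper is a wrapper for managing Python virtual environments. The main reason to use a virtual environment is that angr modifies libz3 and libVEX, which could break other programs that rely on the stock versions of those libraries.
Create a new Python virtual environment: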

$ export WORKON_HOME=~/Envs
$ mkdir -p $WORKON_HOME
$ source /usr/share/virtualenvwrapper/virtualenvwrapper.sh
$ mkvirtualenv angr

Basic Operations

Project

A Project loads a binary and is the starting point for everything you do with angr.

>>> import angr
>>> proj = angr.Project('/bin/true')

Basic properties

Inspect the basic properties of the loaded binary.

>>> proj.arch
<Arch AMD64 (LE)>
>>> proj.entry
0x401670
>>> proj.filename
'/bin/true'
>>> proj.loader
<Loaded true, maps [0x400000:0x5004000]>

>>> proj.loader.shared_objects # may look a little different for you!
{'ld-linux-x86-64.so.2': <ELF Object ld-2.24.so, maps [0x2000000:0x2227167]>,
 'libc.so.6': <ELF Object libc-2.24.so, maps [0x1000000:0x13c699f]>}

>>> proj.loader.min_addr
0x400000
>>> proj.loader.max_addr
0x5004000

>>> proj.loader.main_object  # we've loaded several binaries into this project. Here's the main one!
<ELF Object true, maps [0x400000:0x60721f]>

>>> proj.loader.main_object.execstack  # sample query: does this binary have an executable stack?
False
>>> proj.loader.main_object.pic  # sample query: is this binary position-independent?
True

Loading options

Basic options

Advanced options

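# the first argument is the path of the binary to load; '/bin/true' is used here as in the examples above
angr.Project('/bin/true', main_opts={'backend': 'ida', 'custom_arch': 'i386'}, lib_opts={'libc.so.6': {'backend': 'elf'}})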

Loader

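The loader, CLE ("CLE Loads Everything"), maps a binary into a virtual address space. Each binary format has its own loader backend (cle.Backend); for example, cle.ELF loads ELF files. Every binary that angr loads gets its own region of the address space, but not every object in the address space corresponds to a binary.

Main object information

We can query basic information about the main object that the loader mapped.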

# This is the "main" object, the one that you directly specified when loading the project
>>> proj.loader.main_object
<ELF Object true, maps [0x400000:0x60105f]>
>>> obj = proj.loader.main_object

# The entry point of the object
>>> obj.entry
0x400580

>>> obj.min_addr, obj.max_addr
(0x400000, 0x60105f)

# Retrieve this ELF's segments and sections
>>> obj.segments
<Regions: [<ELFSegment offset=0x0, flags=0x5, filesize=0xa74, vaddr=0x400000, memsize=0xa74>,
           <ELFSegment offset=0xe28, flags=0x6, filesize=0x228, vaddr=0x600e28, memsize=0x238>]>
>>> obj.sections
<Regions: [<Unnamed | offset 0x0, vaddr 0x0, size 0x0>,
           <.interp | offset 0x238, vaddr 0x400238, size 0x1c>,
           <.note.ABI-tag | offset 0x254, vaddr 0x400254, size 0x20>,
            ...etc

# You can get an individual segment or section by an address it contains:
>>> obj.find_segment_containing(obj.entry)
<ELFSegment offset=0x0, flags=0x5, filesize=0xa74, vaddr=0x400000, memsize=0xa74>
>>> obj.find_section_containing(obj.entry)
<.text | offset 0x580, vaddr 0x400580, size 0x338>

# Get the address of the PLT stub for a symbol
>>> addr = obj.plt['__libc_start_main']
>>> addr
0x400540
>>> obj.reverse_plt[addr]
'__libc_start_main'

# Show the prelinked base of the object and the location it was actually mapped into memory by CLE
>>> obj.linked_base
0x400000
>>> obj.mapped_base
0x400000

Other loaded objects

# All loaded objects
>>> proj.loader.all_objects
[<ELF Object fauxware, maps [0x400000:0x60105f]>,
 <ELF Object libc.so.6, maps [0x1000000:0x13c42bf]>,
 <ELF Object ld-linux-x86-64.so.2, maps [0x2000000:0x22241c7]>,
 <ELFTLSObject Object cle##tls, maps [0x3000000:0x300d010]>,
 <KernelObject Object cle##kernel, maps [0x4000000:0x4008000]>,
 <ExternObject Object cle##externs, maps [0x5000000:0x5008000]>]


# This is a dictionary mapping from shared object name to object
>>> proj.loader.shared_objects
{ 'libc.so.6': <ELF Object libc.so.6, maps [0x1000000:0x13c42bf]>,
  'ld-linux-x86-64.so.2': <ELF Object ld-linux-x86-64.so.2, maps [0x2000000:0x22241c7]>}

# Here's all the objects that were loaded from ELF files
# If this were a windows program we'd use all_pe_objects!
>>> proj.loader.all_elf_objects
[<ELF Object true, maps [0x400000:0x60105f]>,
 <ELF Object libc.so.6, maps [0x1000000:0x13c42bf]>,
 <ELF Object ld-linux-x86-64.so.2, maps [0x2000000:0x22241c7]>]

# Here's the "externs object", which we use to provide addresses for unresolved imports and angr internals
>>> proj.loader.extern_object
<ExternObject Object cle##externs, maps [0x5000000:0x5008000]>

# This object is used to provide addresses for emulated syscalls
>>> proj.loader.kernel_object
<KernelObject Object cle##kernel, maps [0x4000000:0x4008000]>

# Finally, you can get a reference to an object given an address in it
>>> proj.loader.find_object_containing(0x400000)
<ELF Object true, maps [0x400000:0x60105f]>

Symbols and relocations

We can also use CLE to work with the symbols in a binary.

>>> malloc = proj.loader.find_symbol('malloc')
>>> malloc
<Symbol "malloc" in libc.so.6 at 0x1054400>
>>> malloc.name
'malloc'

>>> malloc.owner_obj
<ELF Object libc.so.6, maps [0x1000000:0x13c42bf]>

# .rebased_addr is its address in the global address space. This is what is shown in the print output.
>>> malloc.rebased_addr
0x1054400
# .linked_addr is its address relative to the prelinked base of the binary. This is the address reported in, for example, readelf(1)
>>> malloc.linked_addr
0x54400
# .relative_addr is its address relative to the object base. This is known in the literature (particularly the Windows literature) as an RVA (relative virtual address).
>>> malloc.relative_addr
0x54400
>>> malloc.is_export
True
>>> malloc.is_import
False

# On Loader, the method is find_symbol because it performs a search operation to find the symbol.
# On an individual object, the method is get_symbol because there can only be one symbol with a given name.
>>> main_malloc = proj.loader.main_object.get_symbol("malloc")
>>> main_malloc
<Symbol "malloc" in true (import)>
>>> main_malloc.is_export
False
>>> main_malloc.is_import
True
>>> main_malloc.resolvedby
<Symbol "malloc" in libc.so.6 at 0x1054400>
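The three address views of a symbol differ only by which base is added to the symbol's offset. The following is a minimal sketch in plain Python (no angr required) that redoes the arithmetic with the numbers printed for malloc. The function names are illustrative helpers, not CLE API, and libc's linked base of 0 is an assumption inferred from linked_addr being equal to relative_addr in the output.

```python
# Numbers taken from the malloc output shown in this section.
MAPPED_BASE = 0x1000000   # where CLE mapped libc.so.6
LINKED_BASE = 0x0         # assumption: libc's prelinked base, inferred from the output
RELATIVE_ADDR = 0x54400   # malloc's offset from the object base (its RVA)


def to_rebased(relative_addr, mapped_base):
    """Address in the global address space built by the loader."""
    return mapped_base + relative_addr


def to_linked(relative_addr, linked_base):
    """Address relative to the prelinked base, as a tool like readelf reports it."""
    return linked_base + relative_addr


rebased = to_rebased(RELATIVE_ADDR, MAPPED_BASE)
linked = to_linked(RELATIVE_ADDR, LINKED_BASE)
print(hex(rebased), hex(linked))
```

For an object whose linked base is non-zero (such as the main binary at 0x400000), linked_addr and relative_addr would differ by that base.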

Backends

backend name | description | requires custom_arch?
elf | Static loader for ELF files based on PyELFTools | no
pe | Static loader for PE files based on PEFile | no
mach-o | Static loader for Mach-O files. Does not support dynamic linking or rebasing. | no
cgc | Static loader for Cyber Grand Challenge binaries | no
backedcgc | Static loader for CGC binaries that allows specifying memory and register backers | no
elfcore | Static loader for ELF core dumps | no
ida | Launches an instance of IDA to parse the file | yes
blob | Loads the file into memory as a flat image | yes

Symbolic Function

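By default, angr tries to replace the library functions a program calls with its own simulated implementations, which are called SimProcedures. All of them can be found in angr.SIM_PROCEDURES, a dictionary keyed first by package name (libc, posix, win32, etc.) and then by function name.

Things to note:

hook

Hooking an address makes angr run a function you supply instead of the code at that address.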

>>> stub_func = angr.SIM_PROCEDURES['stubs']['ReturnUnconstrained'] # this is a CLASS
>>> proj.hook(0x10000, stub_func())  # hook with an instance of the class

>>> proj.is_hooked(0x10000)            # these functions should be pretty self-explanatory
True
>>> proj.hooked_by(0x10000)
<ReturnUnconstrained>
>>> proj.unhook(0x10000)

# proj.hook(addr) also works as a function decorator; the optional length keyword argument makes execution jump that many bytes forward after your hook finishes.
>>> @proj.hook(0x20000, length=5)
... def my_hook(state):
...     state.regs.rax = 1

>>> proj.is_hooked(0x20000)
True

factory

Why

Methods

Basic blocks

Properties

Methods

state

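A project only describes the program's initial image, while a state describes the concrete state of the process when simulated execution reaches a given instruction. In angr this is represented by a SimState.

Preset states

The factory can construct a default state for the program at a chosen address.

Basic state information

Registers

Memory

Pattern: state.mem[addr].type.xxx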

>>> import angr
>>> proj = angr.Project('/bin/true')
>>> state = proj.factory.entry_state()

# copy rsp to rbp
>>> state.regs.rbp = state.regs.rsp

# store rdx to memory at 0x1000
>>> state.mem[0x1000].uint64_t = state.regs.rdx

# dereference rbp
>>> state.regs.rbp = state.mem[state.regs.rbp].uint64_t.resolved

# add rax, qword ptr [rsp + 8]
>>> state.regs.rax += state.mem[state.regs.rsp + 8].uint64_t.resolved

File system

Execution

Basic execution

>>> proj = angr.Project('examples/fauxware/fauxware')
>>> state = proj.factory.entry_state()
>>> while True:
...     succ = state.step()
...     if len(succ.successors) == 2:
...         break
...     state = succ.successors[0]

>>> state1, state2 = succ.successors
>>> state1
<SimState @ 0x400629>
>>> state2
<SimState @ 0x400699>

Low-level memory access

>>> s = proj.factory.blank_state()
>>> s.memory.store(0x4000, s.solver.BVV(0x0123456789abcdef0123456789abcdef, 128))
>>> s.memory.load(0x4004, 6) # load-size is in bytes
<BV48 0x89abcdef0123>
>>> import archinfo
>>> s.memory.load(0x4000, 4, endness=archinfo.Endness.LE)
<BV32 0x67452301>
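The byte order behind these loads can be checked by hand. The sketch below uses only plain Python integers and bytes (no angr) to reproduce the byte-level arithmetic: the 128-bit value is laid out big-endian, which is how state.memory stores data when no endness is given, and the same bytes are then read back in either order. The load helper is a hypothetical stand-in for s.memory.load, not the real API.

```python
# The 128-bit value from the example, laid out most-significant byte first.
value = 0x0123456789abcdef0123456789abcdef
memory = value.to_bytes(16, "big")   # the bytes living at 0x4000 .. 0x400f


def load(offset, size, byteorder="big"):
    """Read `size` bytes starting `offset` bytes past 0x4000 and return them as an int."""
    return int.from_bytes(memory[offset:offset + size], byteorder)


six_bytes = load(4, 6)             # mirrors s.memory.load(0x4004, 6)
le_word = load(0, 4, "little")     # mirrors s.memory.load(0x4000, 4, endness=LE)
print(hex(six_bytes), hex(le_word))
```

The first four bytes in memory are 01 23 45 67, so a little-endian read treats 0x67 as the most significant byte.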

State Option

# Example: enable lazy solves, an option that causes state satisfiability to be checked as infrequently as possible.
# This change to the settings will be propagated to all successor states created from this state after this line.
>>> s.options.add(angr.options.LAZY_SOLVES)

# Create a new state with lazy solves enabled
>>> s = proj.factory.entry_state(add_options={angr.options.LAZY_SOLVES})

# Create a new state without simplification options enabled
>>> s = proj.factory.entry_state(remove_options=angr.options.simplification)

solver

The solver is essentially a constraint-solving engine.

Working with bitvectors

Converting between bitvectors and Python integers.

# 64-bit bitvectors with concrete values 1 and 100
>>> one = state.solver.BVV(1, 64)
>>> one
 <BV64 0x1>
>>> one_hundred = state.solver.BVV(100, 64)
>>> one_hundred
 <BV64 0x64>

# create a 27-bit bitvector with concrete value 9
>>> weird_nine = state.solver.BVV(9, 27)
>>> weird_nine
<BV27 0x9>
>>> one + one_hundred
<BV64 0x65>

# You can provide normal python integers and they will be coerced to the appropriate type:
>>> one_hundred + 0x100
<BV64 0x164>

# The semantics of normal wrapping arithmetic apply
>>> one_hundred - one*200
<BV64 0xffffffffffffff9c>

# use zero_extend to extend the length of a bitvector
# (sign_extend is also available)
>>> weird_nine.zero_extend(64 - 27)
<BV64 0x9>
>>> one + weird_nine.zero_extend(64 - 27)
<BV64 0xa>
# Create a bitvector symbol named "x" of length 64 bits
>>> x = state.solver.BVS("x", 64)
>>> x
<BV64 x_9_64>
>>> y = state.solver.BVS("y", 64)
>>> y
<BV64 y_10_64>
>>> x + one
<BV64 x_9_64 + 0x1>

>>> (x + one) / 2
<BV64 (x_9_64 + 0x1) / 0x2>

>>> x - y
<BV64 x_9_64 - y_10_64>
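The wrapping behaviour of concrete bitvectors is ordinary arithmetic modulo 2**64. As a rough illustration in plain Python (no angr; the bv helper is a made-up name for this sketch), masking a Python integer to the bitvector width reproduces the results above:

```python
BITS = 64


def bv(value, bits=BITS):
    """Reduce a Python int to its unsigned `bits`-wide representation."""
    return value & ((1 << bits) - 1)


one = bv(1)
one_hundred = bv(100)

wrapped = bv(one_hundred - one * 200)   # 100 - 200 wraps around modulo 2**64
weird_nine = bv(9, 27)                  # a 27-bit value; zero-extension keeps it at 9
total = bv(one + weird_nine)
print(hex(wrapped), hex(total))
```

Zero-extension never changes the unsigned value, which is why adding the extended 27-bit nine to one simply gives ten.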

Inspecting the AST

>>> tree = (x + 1) / (y + 2)
>>> tree
<BV64 (x_9_64 + 0x1) / (y_10_64 + 0x2)>
>>> tree.op
'__div__'
>>> tree.args
(<BV64 x_9_64 + 0x1>, <BV64 y_10_64 + 0x2>)
>>> tree.args[0].op
'__add__'
>>> tree.args[0].args
(<BV64 x_9_64>, <BV64 0x1>)
>>> tree.args[0].args[1].op
'BVV'
>>> tree.args[0].args[1].args
(1, 64)

Symbolic constraints

>>> x == 1
<Bool x_9_64 == 0x1>
>>> x == one
<Bool x_9_64 == 0x1>
>>> x > 2
<Bool x_9_64 > 0x2>
>>> x + y == one_hundred + 5
<Bool (x_9_64 + y_10_64) == 0x69>
>>> one_hundred > 5
<Bool True>
>>> one_hundred > -5
<Bool False>
>>> yes = one == 1
>>> no = one == 2
>>> maybe = x == y
>>> state.solver.is_true(yes)
True
>>> state.solver.is_false(yes)
False
>>> state.solver.is_true(no)
False
>>> state.solver.is_false(no)
True
>>> state.solver.is_true(maybe)
False
>>> state.solver.is_false(maybe)
False
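The result of one_hundred > -5 being False follows from the comparison being unsigned: -5 is first coerced to a 64-bit pattern, which is a very large unsigned number. A small plain-Python sketch of that reasoning (helper names are invented for illustration; in claripy itself the signed variant is requested explicitly, for example with SGT):

```python
BITS = 64


def to_unsigned(value, bits=BITS):
    """Interpret a Python int as an unsigned `bits`-wide bit pattern."""
    return value & ((1 << bits) - 1)


def to_signed(value, bits=BITS):
    """Interpret the same bit pattern as a two's-complement signed number."""
    value = to_unsigned(value, bits)
    return value - (1 << bits) if value >> (bits - 1) else value


one_hundred = to_unsigned(100)
minus_five = to_unsigned(-5)

unsigned_gt = one_hundred > minus_five                        # unsigned view
signed_gt = to_signed(one_hundred) > to_signed(minus_five)    # signed view
print(hex(minus_five), unsigned_gt, signed_gt)
```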

Constraint solving

Basic steps

>>> state.solver.add(x > y)
>>> state.solver.add(y > 2)
>>> state.solver.add(10 > x)
>>> state.solver.eval(x)
4

# get a fresh state without constraints
>>> state = proj.factory.entry_state()
>>> input = state.solver.BVS('input', 64)
>>> operation = (((input + 4) * 3) >> 1) + input
>>> output = 200
>>> state.solver.add(operation == output)
>>> state.solver.eval(input)
0x3333333333333381
# If we add conflicting or contradictory constraints
>>> state.solver.add(input < 2**32)
>>> state.satisfiable()
False
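The value returned for input can be checked by substituting it back into the expression. The sketch below does this with plain Python integers, under the assumption that >> on a bitvector is an arithmetic (sign-preserving) shift, with the logical variant available separately as LShR; the reported value only works out to 200 under that reading. The ashr1 and operation helpers are hypothetical names for this example.

```python
MASK = (1 << 64) - 1


def ashr1(value):
    """Arithmetic shift right by one bit on a 64-bit value (the sign bit is copied)."""
    sign = value & (1 << 63)
    return (value >> 1) | sign


def operation(inp):
    """(((input + 4) * 3) >> 1) + input, evaluated modulo 2**64."""
    step = ((inp + 4) * 3) & MASK
    return (ashr1(step) + inp) & MASK


candidate = 0x3333333333333381
result = operation(candidate)
print(result)
```

The intermediate product has its top bit set, so the shift fills in a 1 from the left and the final addition wraps around to 200.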

Simulation Managers

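A state describes the program at one particular point in its execution. A Simulation Manager controls how the program moves from one state to the next, and it is the main interface in angr for driving simulated execution.

Creating a simulation manager

>>> simgr = proj.factory.simgr(state)
>>> simgr
<SimulationManager with 1 active>

Inspecting states

A manager can hold many states, and each of them can be inspected individually. The active stash is initialized with the state we passed in.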

>>> simgr.active
[<SimState @ 0x401670>]
>>> simgr.step()                             # execute one basic block
>>> simgr.active
[<SimState @ 0x1020300>]
>>> simgr.active[0].regs.rip                 # new and exciting!
<BV64 0x1020300>
>>> state.regs.rip                           # still the same!
<BV64 0x401670>

Execution

step() executes one basic block. It does not modify the state that was originally passed in.

>>> simgr.step()

# Step until the first symbolic branch
>>> while len(simgr.active) == 1:
...    simgr.step()

>>> simgr
<SimulationManager with 2 active>
>>> simgr.active
[<SimState @ 0x400692>, <SimState @ 0x400699>]

# Step until everything terminates
>>> simgr.run()
>>> simgr
<SimulationManager with 3 deadended>

Stash Management

>>> simgr.move(from_stash='deadended', to_stash='authenticated', filter_func=lambda s: b'Welcome' in s.posix.dumps(1))
>>> simgr
<SimulationManager with 2 authenticated, 1 deadended>
>>> for s in simgr.deadended + simgr.authenticated:
...     print(hex(s.addr))
0x1000030
0x1000078
0x1000078
# If you prepend the name of a stash with one_, you will be given the first state in the stash. 
>>> simgr.one_deadended
<SimState @ 0x1000030>
#  If you prepend the name of a stash with mp_, you will be given a mulpyplexed version of the stash.
>>> simgr.mp_authenticated
MP([<SimState @ 0x1000078>, <SimState @ 0x1000078>])
>>> simgr.mp_authenticated.posix.dumps(0)
MP(['\x00\x00\x00\x00\x00\x00\x00\x00\x00SOSNEAKY\x00',
    '\x00\x00\x00\x00\x00\x00\x00\x00\x00S\x80\x80\x80\x80@\x80@\x00'])
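Conceptually, a stash is a named list of states, and move() transfers the states that pass a filter from one list to another. The toy model below is plain Python and is not angr's implementation: ToyManager and the output strings are invented for illustration, and each "state" is reduced to the bytes it wrote to stdout.

```python
class ToyManager:
    """Toy stand-in for a simulation manager: named stashes holding states."""

    def __init__(self, **stashes):
        self.stashes = {name: list(states) for name, states in stashes.items()}

    def move(self, from_stash, to_stash, filter_func=lambda state: True):
        """Move every state accepted by filter_func from one stash to another."""
        moved, kept = [], []
        for state in self.stashes.get(from_stash, []):
            if filter_func(state):
                moved.append(state)
            else:
                kept.append(state)
        self.stashes[from_stash] = kept
        self.stashes.setdefault(to_stash, []).extend(moved)
        return self


# Hypothetical stdout contents for three finished states.
mgr = ToyManager(deadended=[b"Go away!", b"Welcome, admin", b"Welcome, admin"])
mgr.move("deadended", "authenticated", filter_func=lambda out: b"Welcome" in out)
counts = {name: len(states) for name, states in mgr.stashes.items()}
print(counts)
```

The destination stash does not need to exist beforehand; it is created on the first move, which matches how the 'authenticated' stash appears in the transcript.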

explore!!!!

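explore() searches for a state in which the program has reached a given address. It normally takes a find argument.

States that match are placed in the found stash.

You can also pass an avoid condition to explore(), which tells angr not to explore the corresponding addresses.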

>>> proj = angr.Project('examples/CSCI-4968-MBE/challenges/crackme0x00a/crackme0x00a')
>>> simgr = proj.factory.simgr()
>>> simgr.explore(find=lambda s: b"Congrats" in s.posix.dumps(1))
<SimulationManager with 1 active, 1 found>
>>> s = simgr.found[0]
>>> print(s.posix.dumps(1))
Enter password: Congrats!

>>> flag = s.posix.dumps(0)
>>> print(flag)
g00dJ0B!
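The find/avoid idea can be pictured as a graph search that sorts states into stashes. The following toy sketch is plain Python over a made-up control-flow graph; the addresses, the GRAPH table, and the explore function are all hypothetical and only imitate the bookkeeping, not angr's symbolic execution.

```python
from collections import deque

# Hypothetical control-flow graph: address -> successor addresses.
GRAPH = {
    0x1000: [0x1010, 0x1020],
    0x1010: [0x1030],
    0x1020: [0x1040],   # the branch we choose to avoid
    0x1030: [0x1050],   # leads to the target
    0x1040: [],
    0x1050: [],
}


def explore(start, find, avoid=()):
    """Breadth-first walk that stops at the first address equal to `find`."""
    active = deque([start])
    stashes = {"found": [], "avoid": [], "deadended": []}
    seen = set()
    while active:
        addr = active.popleft()
        if addr in seen:
            continue
        seen.add(addr)
        if addr == find:
            stashes["found"].append(addr)
            break
        if addr in avoid:
            stashes["avoid"].append(addr)   # never stepped any further
            continue
        successors = GRAPH[addr]
        if not successors:
            stashes["deadended"].append(addr)
        active.extend(successors)
    stashes["active"] = list(active)
    return stashes


result = explore(0x1000, find=0x1050, avoid={0x1020})
print({name: [hex(a) for a in addrs] for name, addrs in result.items()})
```

Because 0x1020 is avoided, its successor 0x1040 is never visited, which is the point of avoid: it prunes work that cannot lead to the target.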

extra

Stash | Description
active | This stash contains the states that will be stepped by default, unless an alternate stash is specified.
deadended | A state goes to the deadended stash when it cannot continue the execution for some reason, including no more valid instructions, unsat state of all of its successors, or an invalid instruction pointer.
pruned | When using LAZY_SOLVES, states are not checked for satisfiability unless absolutely necessary. When a state is found to be unsat in the presence of LAZY_SOLVES, the state hierarchy is traversed to identify when, in its history, it initially became unsat. All states that are descendants of that point (which will also be unsat, since a state cannot become un-unsat) are pruned and put in this stash.
unconstrained | If the save_unconstrained option is provided to the SimulationManager constructor, states that are determined to be unconstrained (i.e., with the instruction pointer controlled by user data or some other source of symbolic data) are placed here.
unsat | If the save_unsat option is provided to the SimulationManager constructor, states that are determined to be unsatisfiable (i.e., they have constraints that are contradictory, like the input having to be both "AAAA" and "BBBB" at the same time) are placed here.

analysis

Analyses provide various kinds of information about the program.

One example is the control-flow graph.

# Originally, when we loaded this binary it also loaded all its dependencies into the same virtual address space
# This is undesirable for most analysis.
>>> proj = angr.Project('/bin/true', auto_load_libs=False)
>>> cfg = proj.analyses.CFGFast()
>>> cfg
<CFGFast Analysis Result at 0x2d85130>

# cfg.graph is a networkx DiGraph full of CFGNode instances
# You should go look up the networkx APIs to learn how to use this!
>>> cfg.graph
<networkx.classes.digraph.DiGraph at 0x2da43a0>
>>> len(cfg.graph.nodes())
951

# To get the CFGNode for a given address, use cfg.get_any_node
>>> entry_node = cfg.get_any_node(proj.entry)
>>> len(list(cfg.graph.successors(entry_node)))
2

class angr.block.CapstoneBlock(addr, insns, thumb, arch)

Deep copy of the capstone blocks, which have serious issues with having extended lifespans outside of capstone itself
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////


angr-0ctf_momo

1. Reverse the binary to find the three conditions needed for constraint solving

dx, dword ptr [edx*4 + 0x81fe260]
al, byte ptr [0x81fe6e0]
dl, byte ptr [0x81fe6e4]

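2. You need to be able to reverse programs compiled with MoVfuscator

1. Analyze manually with qira + IDA,
2. or use a movfuscator deobfuscator
3. Use a Makefile + binary instrumentation
4. How well angr can solve the challenge depends on how well you understand the reversed program

3. Parts of angr's constraint-solving process are still not entirely clear to me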

References:
1: Learning angr (part 4):
http://www.cnblogs.com/fancystar/p/7893248.html
2: Makefile + binary instrumentation:
https://blog.xy14qg.top/2016/0ctf-2016-writeup/#momo-reverse
3: angr example walkthrough, 0ctf_momo_3:
http://blog.csdn.net/doudoudouzoule/article/details/79537019
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
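Related material:
I found an open-source project online, https://github.com/kirschju/demovfuscator, a deobfuscator built specifically to deal with movfuscator, so I installed it right away.
Solving momo's movfuscator with qira:
http://blog.csdn.net/charlie_heng/article/details/79206863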
