angr usage explained, with common usage in practice

Contents

angr usage explained, with common usage in practice
- Installation
  - Common errors
    - libgomp.so.1: version GOMP_4.0 not found, or other z3 issues
    - No such file or directory: 'pyvex_c'
    - AttributeError: 'FFI' object has no attribute 'unpack'
    - angr has no attribute Project, or similar
    - AttributeError: 'module' object has no attribute 'KS_ARCH_X86'
    - No such file or directory: 'libunicorn.dylib'
    - pthread check failed: Make sure to have the pthread libs and headers installed.
- Quick start
  - loader: static loading
  - The factory: angr's main entry point
    - Blocks: the basic block at a given address
    - States: the simulated program state
    - Simulation Managers
  - Analyses: static analysis information
- Practical demo
- Other documents
- Related reverse-engineering CTF challenges

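angr is a multi-architecture binary analysis toolkit that can perform dynamic symbolic execution (like Mayhem, KLEE, etc.) as well as various static analyses on binaries.

It lets you do binary analysis from Python:

- Lifting binaries into a common intermediate representation (IR)
- Binary analysis
- Partial or whole-program static analysis (e.g. dependency analysis, program slicing)
- Symbolic exploration of a program's state space (i.e. "can we execute it until we find an overflow?")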

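Installation

Windows: pip install angr

Ubuntu/Debian: sudo apt-get install python3-dev libffi-dev build-essential virtualenvwrapper, then pip install angr (apt-get is not available on CentOS; install the equivalent packages with its own package manager first)

Both of the above require Python 3.8 or later.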
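Because the install commands assume Python 3.8 or later, checking the running interpreter first can rule out one class of install failures. The following is only a minimal sketch; the helper name `meets_minimum` is made up for illustration and is not part of angr.

```python
import sys

def meets_minimum(version_info, minimum=(3, 8)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)

# Report on the interpreter that would run `pip install angr`.
status = "ok" if meets_minimum(sys.version_info) else "too old"
print(sys.version.split()[0], status)
```

Run it with the same interpreter that pip belongs to (for example `python -m pip` and `python` from the same virtualenv), otherwise the check says nothing about the environment angr is installed into.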

Docker:

# install docker
curl -sSL https://get.docker.com/ | sudo sh

# pull the docker image
sudo docker pull angr/angr

# run it
sudo docker run -it angr/angr

Common errors

libgomp.so.1: version GOMP_4.0 not found, or other z3 issues

The pre-compiled libz3.so is incompatible with the installed version of libgomp. Z3 needs to be recompiled:

pip install -I --no-binary z3-solver z3-solver

No such file or directory: 'pyvex_c'

Ubuntu 12.04 is too old. Upgrade the system, or at least upgrade pip: python -m pip install -U pip

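AttributeError: 'FFI' object has no attribute 'unpack'

The cffi Python module is too old. angr requires cffi 1.7 or later. Try pip install --upgrade cffi. If the problem persists, make sure your operating system has not pre-installed an old version of cffi, which pip may refuse to uninstall. If you are using a Python virtual environment with the pypy interpreter, install a recent version of pypy, because it bundles a version of cffi that pip will not upgrade.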
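To see which cffi version the current interpreter actually picks up, the standard library's `importlib.metadata` can be queried. This is a sketch under the assumption that a plain numeric comparison against 1.7 is enough; `version_tuple` and `cffi_is_new_enough` are illustrative helper names, not angr or cffi APIs.

```python
from importlib import metadata

def version_tuple(text):
    """Parse leading numeric components, e.g. '1.15.1' -> (1, 15, 1)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def cffi_is_new_enough(minimum=(1, 7)):
    """True/False for the installed cffi, or None if cffi is not installed."""
    try:
        installed = metadata.version("cffi")
    except metadata.PackageNotFoundError:
        return None
    return version_tuple(installed) >= minimum

print("cffi >= 1.7:", cffi_is_new_enough())
```

If this prints False even after `pip install --upgrade cffi`, an older system-wide copy is probably being imported ahead of the upgraded one.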

angr has no attribute Project, or similar

If you can import angr but it does not seem to be the real angr module, check whether you named your own main script angr.py.
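One way to confirm this kind of shadowing is to ask Python where it would import the module from. The sketch below uses only the standard library; `module_origin` and `looks_shadowed` are hypothetical helper names chosen for this example.

```python
import importlib.util
import os

def module_origin(name):
    """Return the file a module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

def looks_shadowed(name, directory="."):
    """True if `name` resolves to a file directly inside `directory`,
    e.g. a stray angr.py next to your script."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return False
    found_in = os.path.dirname(os.path.abspath(spec.origin))
    return found_in == os.path.abspath(directory)

print("angr resolves to:", module_origin("angr"))
print("shadowed by a local file:", looks_shadowed("angr"))
```

A properly installed angr resolves to an `__init__.py` inside a site-packages directory; a path ending in your own `angr.py` means the script needs renaming.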

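AttributeError: 'module' object has no attribute 'KS_ARCH_X86'

You may have installed the keystone package, which conflicts with keystone-engine (an optional dependency of angr). Uninstall keystone. If you do need keystone-engine, install it with pip install --no-binary keystone-engine keystone-engine, because the current pip distribution is broken.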

No such file or directory: 'libunicorn.dylib'

(Alternate error message: Cannot use 'python', Python 2.4 or later is required. Note that Python 3 or later is not yet supported.)

You need to set the UNICORN_QEMU_FLAGS environment variable for pip.

pthread check failed: Make sure to have the pthread libs and headers installed.

(macOS) Try using GCC instead of Clang.

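Quick start

import angr
proj = angr.Project('path/to/your/binary')  # path to the binary file
print(proj.arch)
print(proj.entry)
print(proj.filename)
# <Arch X86 (LE)>
# 4199648
# test.exe

The code above prints the sample's architecture, entry point, and file name. The entry point is printed in decimal; 4199648 is 0x4014e0.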

loader: static loading

import angr
from cle import Loader
proj = angr.Project('test.exe')
loader: Loader = proj.loader
print(loader.shared_objects)  # the loaded objects, keyed by name
print(hex(loader.min_addr))  # lowest address of the loaded memory image
print(hex(loader.max_addr))  # highest address of the loaded memory image
main_object = loader.main_object
print(main_object)  # the main object (the binary itself)
print(main_object.execstack)  # False: whether the binary has an executable stack
print(main_object.pic)  # False: whether the binary is position-independent (PIC)

# OrderedDict([('kernel32.dll', ), , ...('extern-address space', ), ('cle##tls', )])
# 0x400000
# 0x6b8e4897
# 
# False
# False

The factory: angr's main entry point

Blocks: the basic block at a given address

import angr
proj = angr.Project('test.exe')
block = proj.factory.block(proj.entry)
block.pp()  # pp() prints the disassembly itself and returns None
#         _start:
# 4014e0  sub     esp, 0xc
# 4014e3  mov     dword ptr [0x426054], 0x0
# 4014ed  call    0x416fa0
print(block.instructions)
# 3
print(block.instruction_addrs)
# (4199648, 4199651, 4199661)
print(block.capstone)
# 0x4014e0:       sub     esp, 0xc
# 0x4014e3:       mov     dword ptr [0x426054], 0
# 0x4014ed:       call    0x416fa0
print(block.vex)
# IRSB {
#    t0:Ity_I32 t1:Ity_I32 t2:Ity_I32 t3:Ity_I32 t4:Ity_I32 t5:Ity_I32 t6:Ity_I32 t7:Ity_I32

#    00 | ------ IMark(0x4014e0, 3, 0) ------
#    01 | t2 = GET:I32(esp)
#    02 | t0 = Sub32(t2,0x0000000c)
#    03 | PUT(cc_op) = 0x00000006
#    04 | PUT(cc_dep1) = t2
#    05 | PUT(cc_dep2) = 0x0000000c
#    06 | PUT(cc_ndep) = 0x00000000
#    07 | PUT(esp) = t0
#    08 | PUT(eip) = 0x004014e3
#    09 | ------ IMark(0x4014e3, 10, 0) ------
#    10 | STle(0x00426054) = 0x00000000
#    11 | PUT(eip) = 0x004014ed
#    12 | ------ IMark(0x4014ed, 5, 0) ------
#    13 | t5 = Sub32(t0,0x00000004)
#    14 | PUT(esp) = t5
#    15 | STle(t5) = 0x004014f2
#    NEXT: PUT(eip) = 0x00416fa0; Ijk_Call
# }

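instructions is the number of instructions in the block; instruction_addrs holds the address of each instruction.
pp() and capstone both give a human-readable disassembly.
vex gives the block lifted to VEX IR as a pyvex IRSB object. If you see an address in that object's default repr, it is a Python-internal address, not a program address.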
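angr prints addresses in decimal, which makes them awkward to compare with the disassembly. The short sketch below only post-processes the values from the sample output (nothing is queried from angr itself) to show how the decimal addresses line up with the hex ones and with the instruction lengths in the IMark lines.

```python
# Values copied from the sample output above (decimal, as angr prints them).
instruction_addrs = (4199648, 4199651, 4199661)
last_instruction_size = 5  # from the last IMark(0x4014ed, 5, 0) line

hex_addrs = [hex(a) for a in instruction_addrs]

# Each instruction's size is the distance to the next instruction's address.
sizes = [b - a for a, b in zip(instruction_addrs, instruction_addrs[1:])]
sizes.append(last_instruction_size)

# The block ends right after its last instruction.
block_end = instruction_addrs[-1] + last_instruction_size

print(hex_addrs)       # ['0x4014e0', '0x4014e3', '0x4014ed']
print(sizes)           # [3, 10, 5]
print(hex(block_end))  # 0x4014f2
```

The computed end address 0x4014f2 is also the return address that the VEX listing stores on the stack for the call (`STle(t5) = 0x004014f2`).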

States: the simulated program state

import angr
proj = angr.Project('test.exe')
state = proj.factory.entry_state()  # create a state positioned at the program entry point
print(state)  # <SimState @ 0x4014e0>

# Read registers
print(state.regs)
print(state.solver.eval(state.regs.eax, cast_to=int))  # evaluate eax as an int: 0

state.regs.esi = state.solver.BVV(63, 32)  # assign a value to esi
print(state.regs.esi)  # <BV32 0x3f>

bv_data = state.mem[proj.entry]  # typed view of the memory at the entry point
print(state.solver.eval(bv_data.long.resolved, cast_to=bytes))  # evaluate the expression as bytes: b'\xc7\x0c\xec\x83'

bv_data = state.memory.load(proj.entry, 20)  # load 20 bytes of memory as a bitvector expression

state.mem[0x1000].int = 31  # write a value to memory
print(state.mem[0x1000].long.resolved)  # check the result

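entry_state creates a state that starts simulating at the program's entry point.
state.regs gives access to the current registers.
- state.regs.eax is the expression currently held in eax; other registers are accessed by name in the same way
state.mem[address] / state.memory.load(address, length) read the current memory.
- state.mem[proj.entry] is a view of the memory at address proj.entry
- state.memory.load(address, length) returns an expression for the given number of bytes of memory
state.solver.eval(expression, cast_to=target type)
- state.solver.eval(state.regs.eax, cast_to=int) evaluates eax and returns it as an int: 0
- state.solver.eval(bv_data.long.resolved, cast_to=bytes) interprets the memory as a long and returns it as bytes: b'\xc7\x0c\xec\x83'
state.solver.BVV(value, bits)
- state.solver.BVV(63, 32) creates a 32-bit expression with the value 63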
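The byte order of b'\xc7\x0c\xec\x83' can look surprising, since the block starts with `sub esp, 0xc`. The sketch below reproduces it in plain Python. The four raw bytes are an assumption inferred from the disassembly shown earlier (`sub esp, 0xc` encodes as 83 EC 0C, and the following `mov dword ptr [...]` starts with C7); they are not read from the binary here.

```python
# First four bytes at the entry point, inferred from the disassembly.
raw = bytes.fromhex("83ec0cc7")

# A 32-bit load on little-endian x86 treats the first byte as least significant.
value = int.from_bytes(raw, "little")
print(hex(value))  # 0xc70cec83

# Writing that integer out most-significant byte first gives the printed result.
as_bytes = value.to_bytes(4, "big")
print(as_bytes)  # b'\xc7\x0c\xec\x83'
```

In other words, the typed `.long` view applies the architecture's endianness when loading, and `cast_to=bytes` then serializes the resulting value most-significant byte first, so the bytes appear reversed relative to memory order.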

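Simulation Managers

import angr
proj = angr.Project('test.exe')
state = proj.factory.entry_state()  # create a state at the program entry point
print(state)  # <SimState @ 0x4014e0>
simulation = proj.factory.simulation_manager(state)  # create a simulation manager that tracks this state
print(simulation)  # <SimulationManager with 1 active>
print(simulation.active)  # list of active states: [<SimState @ 0x4014e0>]

# Step forward once (one basic block)
simulation.step()
print(simulation.active)  # the active states after the step
print(simulation.active[0].regs.eip)  # eip of the stepped state held by the manager
print(state.regs.eip)  # eip of the original state, which the step did not modify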

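proj.factory.entry_state creates a state for symbolic execution.
proj.factory.simulation_manager wraps that state in a simulation manager, which drives the execution.
- simulation.active is a list of the currently active states; each state's instruction pointer (EIP/RIP) gives its current position
- simulation.step() advances every active state by one basic block (not by a single instruction)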
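To make the idea of stepping states and sorting them into stashes more concrete, here is a toy model in plain Python. It is not angr's implementation and does no symbolic execution: states are reduced to bare addresses, and the successor map `toy_cfg` is entirely hypothetical. It only illustrates how repeated stepping moves states from an active list into found, avoid, or deadended lists.

```python
def explore(successors, start, find, avoid, max_steps=100):
    """Toy stash-based exploration over a hand-written successor map."""
    stashes = {"active": [start], "found": [], "avoid": [], "deadended": []}
    for _ in range(max_steps):
        if stashes["found"] or not stashes["active"]:
            break
        next_active = []
        for addr in stashes["active"]:
            nexts = successors.get(addr, [])
            if not nexts:
                stashes["deadended"].append(addr)  # nowhere left to go
            for nxt in nexts:
                if nxt == find:
                    stashes["found"].append(nxt)
                elif nxt == avoid:
                    stashes["avoid"].append(nxt)
                else:
                    next_active.append(nxt)
        stashes["active"] = next_active
    return stashes

# Hypothetical control flow: block address -> successor block addresses.
toy_cfg = {
    0x1000: [0x1010, 0x1020],
    0x1010: [0x1030],
    0x1020: [0x2000],
    0x1030: [0x3000],
}
result = explore(toy_cfg, start=0x1000, find=0x3000, avoid=0x2000)
print({name: [hex(a) for a in addrs] for name, addrs in result.items()})
```

A branch in the toy graph yields two active entries after one step, which mirrors why simulation.active is a list rather than a single state.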

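Analyses: static analysis information

Build the CFG (control-flow graph):

import angr
proj = angr.Project('test.exe', auto_load_libs=False)
cfg = proj.analyses.CFG()  # build the CFG
print(cfg.graph) # DiGraph with 8292 nodes and 14039 edges
print(cfg.graph.nodes())  # all nodes in the graph
entry_node = cfg.get_any_node(proj.entry)  # the node at the entry point
print(list(cfg.graph.successors(entry_node)))  # the entry node's successors (two CFGNode objects)

Note: using the name FastCFG here may raise an error; use CFG directly. (In angr the fast analysis is named CFGFast, and CFG uses it by default.)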

Practical demo

Test challenge: [NPUCTF2020]EzObfus-Chapter2

int __cdecl main(int argc, const char **argv, const char **envp)
{
  int j; // [esp+18h] [ebp-10h]
  int i; // [esp+1Ch] [ebp-Ch]

  sub_416F80();
  f_printf("Give Me Your Flag:\n");
  f_scanf("%s", v_input); // .bss:00426020
  if ( strlen(v_input) == 22 )
  {
    f_check();
    for ( i = 1; i <= 21; ++i )
    {
      v_input[i] += (g_map[i % 6] >> 6) ^ (16 * g_map[(i - 1) % 6]);
      v_input[i] = ((int)(unsigned __int8)v_input[i] >> 3) | (32 * v_input[i]);
    }
    for ( j = 0; j <= 21; ++j )
    {
      if ( v_input[j] != g_key[j] )
        goto LABEL_2;
    }
    f_printf("Good Job!\n"); // 0x00416609
    return 0;
  }
  else
  {
LABEL_2:
    f_printf("Error!\n"); // 0x004165EA
    return 0;
  }
}

The program asks the user for 22 characters and passes them to f_check. On success it prints Good Job (0x00416609); otherwise it prints Error (0x004165EA).
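The second statement inside the first loop of main() is an 8-bit rotate right by 3, which is easier to see when the loop is rewritten in Python. The sketch below models only that loop, not f_check. It assumes g_map holds non-negative integers and that v_input behaves as an array of 8-bit values; the actual contents of g_map are not shown in the source, so the example call uses placeholder zeros.

```python
def ror8(value, count):
    """Rotate an 8-bit value right by `count` bits."""
    value &= 0xFF
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF

def transform(data, g_map):
    """Apply the first loop of main() to a byte string; index 0 is untouched."""
    out = bytearray(data)
    for i in range(1, len(out)):
        delta = (g_map[i % 6] >> 6) ^ (16 * g_map[(i - 1) % 6])
        out[i] = (out[i] + delta) & 0xFF                  # char arithmetic wraps at 8 bits
        out[i] = ((out[i] >> 3) | (32 * out[i])) & 0xFF   # same as ror8(out[i], 3)
    return bytes(out)

# Placeholder g_map of zeros: the additive term vanishes, leaving only the rotate.
print(transform(b"\x09\x09", [0] * 6))  # b'\t!'
```

Because every step is a simple reversible byte operation, this part could also be inverted by hand; symbolic execution is mainly useful for getting through f_check without reversing it.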

Based on this, we start execution at the call to f_check (0x004164F8) and store the 22 bytes of input data into both v_input and v_result, which gives the following exploit script:

import sys
import angr
proj = angr.Project('EzObfus-Chapter2.exe', auto_load_libs=False)
addr_v_input = 0x00426020  # .bss:00426020
addr_v_result = 0x0042612C
v_input_length = 22
addr_f_check = 0x004164F8

entry_f_check = proj.factory.blank_state(
    addr=addr_f_check)  # `f_check` method entry
simulate_input = entry_f_check.posix.get_fd(0)  # file descriptor 0 (stdin) of the simulated state
input_data, data_size = simulate_input.read_data(v_input_length)  # read 22 bytes of symbolic input
entry_f_check.memory.store(addr_v_input, input_data)  # store it at the address of v_input
entry_f_check.memory.store(addr_v_result, input_data)  # store it at the address of v_result

simulation = proj.factory.simulation_manager(entry_f_check)
addr_success = 0x00416609  # f_printf("Good Job!\n"); // 0x00416609
addr_fail = 0x004165EA  # f_printf("Error!\n"); // 0x004165EA
simulation.explore(find=addr_success, avoid=addr_fail)
if not simulation.found:
    print('fail to solve')
    sys.exit(-1)
print(simulation.found)
solution = simulation.found[0]

# exp_v_input = solution.mem[addr_v_input]
# print(entry_f_check.solver.eval(exp_v_input.long.resolved, cast_to=bytes))
exp_v_input = solution.memory.load(addr_v_result, v_input_length)
result = solution.solver.eval(exp_v_input, cast_to=bytes)
print(result) # b'npuctf{WDNMD_LJ_OBFU!}'
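A solver returns one satisfying assignment, so it is worth sanity-checking the recovered bytes against what the challenge is known to require. The sketch below checks the 22-character length from main() and, as an assumption based only on the printed result, the `npuctf{...}` wrapper; `check_flag` is an illustrative helper, not part of angr.

```python
import string

def check_flag(candidate, expected_length=22, prefix=b"npuctf{", suffix=b"}"):
    """Return a list of problems with a candidate flag (empty means it looks fine)."""
    problems = []
    if len(candidate) != expected_length:
        problems.append(f"length {len(candidate)} != {expected_length}")
    if not (candidate.startswith(prefix) and candidate.endswith(suffix)):
        problems.append("missing flag wrapper")
    printable = set(string.printable.encode())
    if any(b not in printable for b in candidate):
        problems.append("contains non-printable bytes")
    return problems

print(check_flag(b"npuctf{WDNMD_LJ_OBFU!}"))  # []
```

If a run ever returns bytes that fail such a check, the usual remedy is to add constraints on the symbolic input (for example, restricting each byte to the printable range) before exploring.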

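Other documents

CTF reversing: commonly used reversing tools (extraction code: pnbt)
Common usage
- Quick handling of encryption algorithms in Python (RSA, AES, DES, 3DES, XXTEA, Blowfish)
- Using Python's struct library
Bilibili tutorial: CTF training for a Chinese provincial team (reverse engineering part)
- CTF training for a Chinese provincial team (reverse engineering part) (authorized) (part 1)
- Basic encryption schemes such as XXTEA and Base64 with a custom alphabet
- Constraint solving for equations, inequalities, and the like with the Python Z3 library
- Basic fake-jump junk instructions (junk bytes)
- Unnatural program control flow
  - Control-flow flattening
  - OLLVM control flow (VM-based protection): hard, and usually not tested
  - In IDA, press X to follow cross-references and look for every reference whose Type is w (i.e. a write); these are usually the key locations
- CTF training for a Chinese provincial team (reverse engineering part) (authorized) (part 2)
- Unpacking by dynamic debugging with OllyDbg, using UPX as the example
- Python reversing and custom virtual instructions
  - Use pycdc (extraction code: dorr) to decompile an exe or pyc compiled from Python
  - Parse, one by one, instruction calls implemented by hand with a Python dict
  - Reversing programs compiled from C++
- CTF training for a Chinese provincial team (reverse engineering part) (authorized) (part 3)
  - Simple modular-arithmetic encryption
  - base58: look for unusually large numbers, which are usually the signature of an algorithm, or use the Findcrypt plugin (Ctrl+Alt+F) that ships with IDA 7.7 and later
  - Key locations are often where new memory is allocated, or where a function suddenly returns in the middle
  - For maze challenges, just draw the maze out
  - Dynamic-debugging challenges
    - Watch for anti-debugging branches that get executed, e.g. an int 3, which must be skipped
Basic knowledge
- Endianness (big-endian and little-endian)

For more general approaches to CTF reversing challenges and downloads of common tools, see this blog post: How to approach CTF Reverse challenges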
