来源:

https://tttang.com/archive/1595/

https://paper.seebug.org/1732/

这篇文章主要目的是梳理afl的关键机制,即编译和fuzz主要逻辑的梳理。

编译过程

afl-gcc

我们主要用afl-gcc对源码进行编译和插桩。

afl-gcc实际上是对gcc命令的封装。

这里使用了一个关键函数find_as,用于寻找afl-as的路径。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// afl-gcc.c: 310
int main(int argc, char** argv) {

...

find_as(argv[0]);

edit_params(argc, argv);

execvp(cc_params[0], (char**)cc_params);

FATAL("Oops, failed to execute '%s' - check your PATH", cc_params[0]);

return 0;

}

在命令的转换过程中,加入了-B选项,用于设置编译器,即设置我们的汇编器。

afl-as

afl-as也是对as命令的封装

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// afl-as.c: 477
int main(int argc, char** argv) {

...

edit_params(argc, argv);

...

if (!just_version) add_instrumentation();

if (!(pid = fork())) {

execvp(as_params[0], (char**)as_params);
FATAL("Oops, failed to execute '%s' - check your PATH", as_params[0]);

}

...

if (waitpid(pid, &status, 0) <= 0) PFATAL("waitpid() failed");

...

}

add_instrumentation中,对汇编代码进行插桩。插桩的具体过程就不具体分析了,总的来说是在text段中,跳转的时候,进行记录插桩,最后再结尾处插入fork_server的汇编代码。

afl-forkserver

看这里的话可以先看。

afl-fuzz

afl工作流如下

afl-fuzz-flow.png

main

首先是参数解析,将种子文件夹,输出文件夹等参数保存。

然后进行一些系统检查和环境变量设置。

setup_...函数开始,进入正篇。

来源: https://tttang.com/archive/1686/

  • 调用setup_post函数:如果指定了环境变量AFL_POST_LIBRARY,则会从指定的动态链接库so中加载函数afl_postprocess并将函数指针存储到post_handler当中,每次在运行样例前都会尝试调用该函数。这样做的内涵是提供一个接口来让用户hook模糊测试,在模糊测试过程中执行自定义的功能代码。
  • setup_shm:初始化样例路径覆盖状态变量virgin_bits、超时样例路径覆盖状态变量virgin_tmout、崩溃样例路径覆盖状态变量virgin_crash,用于后续存储样例覆盖目标程序运行路径的状态;使用SYSTEM V申请共享内存trace_bits(详情可以看《进程共享内存技术》),用于后续存储每次样例运行所覆盖的路径。
  • init_count_class16:初始化count_class_lookup16数组,该数组的作用是帮助快速归类统计路径覆盖的数量。
  • setup_dirs_fds:创建所有的输出目录,打开部分全局的文件句柄。创建输出目录queuecrasheshangs等,打开文件句柄dev_null_fddev_urandom_fd以及plot_file等。
  • read_testcases:逐个读取种子目录下的输入文件列表,并调用add_to_queue函数将相关信息(文件名称、大小等)存入到全局的种子队列queue当中,作为后续模糊测试的种子来源。单个种子信息保存在结构体queue_entry当中,形成单链表。
  • load_auto:尝试在输入目录下寻找自动生成的字典文件,调用maybe_add_auto将相应的字典加入到全局变量a_extras中,用于后续字典模式的变异当中。
  • pivot_inputs:根据相应的种子文件路径在输出目录下创建链接或拷贝至该目录下,形成orignal文件,文件命名的规则是%s/queue/id:%06u,orig:%s", out_dir, id, use_name,并更新至对应的种子信息结构体queue_entry中。
  • load_extras:如果指定了-x参数(字典模式),加载对应的字典到全局变量extras当中,用于后续字典模式的变异当中。
  • find_timeout:如果指定了resuming_fuzz即从输出目录当中恢复模糊测试状态,会从之前的模糊测试状态fuzzer_stats文件中计算中timeout值,保存在exec_tmout中。
  • detect_file_args:检测输入的命令行中是否包含@@参数,如果包含的话需要将@@替换成目录文件"%s/.cur_input", out_dir,使得模糊测试目标程序的命令完整;同时将目录文件"%s/.cur_input"路径保存在out_file当中,后续变异的内容保存在该文件路径中,用于运行测试目标文件。
  • setup_stdio_file:如果目标程序的输入不是来源于文件而是来源于标准输入的话,则将目录文件"%s/.cur_input"文件打开保存在out_fd文件句柄中,后续将标准输入重定向到该文件中;结合detect_file_args函数实现了将变异的内容保存在"%s/.cur_input"文件中,运行目标测试文件并进行模糊测试。
  • check_binary:对二进制进行一系列的检查,包括检查二进制是否是bash文件、是否是ELF文件、是否包含共享内存标志、是否包含插桩的标志等。
1
2
3
4
5
6
7
8
9
10
11
12
13
// afl-fuzz.c: 8070
perform_dry_run(use_argv);

cull_queue();

show_init_stats();

seek_to = find_start_position();

write_stats_file(0, 0, 0);
save_auto();

if (stop_soon) goto stop_fuzzing;

perform_dry_run

之前的种子通过read_testcases添加到了queue_entry中,作为一个种子的单链表。在这个函数中将每个种子作为输入,运行程序一次。如果有报错则直接退出,因为程序无法正常运行。同时还会运行一个重要的函数calibrate_case。这个之后会介绍

cull_queue

根据种子的运行结果进行排序。

fork_server通信

来源:https://tttang.com/archive/1707/

fork_server过程这篇文章讲的比较详细,在afl中,共享内存和管道主要用于afl-fuzz和插桩的程序运行过程中的通信。我们主要通过shm...api来实现对共享内存的请求、访问、销毁。

❗️fork_server的原理就是通过fork执行目标程序,减少execve的消耗,同时与fuzzer进行通信。

具体来说使用到了两个关键管道。st_pipectl_pipe,从名字来看就可以知道,是管理状态和管理控制指令的。

❗这里讲到afl是如何记录信息的。afl主要通过覆盖率记录运行信息,而这个覆盖率具体来说就是edge(边)覆盖率。在程序插桩的过程中会在每个基本块插入一个随机值作为唯一编号,当样本从一个基本块运行到另一个基本块的时候会根据两个块的唯一的编号形成一条边,以此形成覆盖率。

插入编号的代码如下:

1
2
3
4
5
6
7
8
9
10
// afl-as.c: 270
fprintf(outf, use_64bit ? trampoline_fmt_64 : trampoline_fmt_32,
R(MAP_SIZE));

// types.h: 82
# define R(x) (random() % (x))

// config.h: 328
#define MAP_SIZE_POW2 16
#define MAP_SIZE (1 << MAP_SIZE_POW2)

可以看到,这里使用printf模板字符串传入了一个随机值R,R是随机数对MAP_SIZE取模,所以这里有了一个问题,随机值是否会发生碰撞。

1
2
3
4
5
6
7
8
 Branch cnt | Colliding tuples | Example targets
------------+------------------+-----------------
1,000 | 0.75% | giflib, lzo
2,000 | 1.5% | zlib, tar, xz
5,000 | 3.5% | libpng, libwebp
10,000 | 7% | libxml
20,000 | 14% | sqlite
50,000 | 30% | -

可见,随着应用越来越大,分支数量变多,碰撞几率也随着变大。

信息交换

在这一节讲讲程序是如何与fuzzer交换信息的。

首先是插入程序的”桩“

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
static const u8* trampoline_fmt_64 =

"\n"
"/* --- AFL TRAMPOLINE (64-BIT) --- */\n"
"\n"
".align 4\n"
"\n"
"leaq -(128+24)(%%rsp), %%rsp\n"
"movq %%rdx, 0(%%rsp)\n"
"movq %%rcx, 8(%%rsp)\n"
"movq %%rax, 16(%%rsp)\n"
"movq $0x%08x, %%rcx\n"
"call __afl_maybe_log\n"
"movq 16(%%rsp), %%rax\n"
"movq 8(%%rsp), %%rcx\n"
"movq 0(%%rsp), %%rdx\n"
"leaq (128+24)(%%rsp), %%rsp\n"
"\n"
"/* --- END --- */\n"
"\n";

这里会开辟一个栈,然后保存rax、rcx、rdx,然后将代码块编号存入rcx中,调用__afl_maybe_log,这里代码块编号是在fprintf时通过模板%08x传入的。

再看main_payload的部分。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
static const u8* main_payload_64 = 

"\n"
"/* --- AFL MAIN PAYLOAD (64-BIT) --- */\n"
"\n"
".text\n"
".att_syntax\n"
".code64\n"
".align 8\n"
"\n"
"__afl_maybe_log:\n"
"\n"
#if defined(__OpenBSD__) || (defined(__FreeBSD__) && (__FreeBSD__ < 9))
" .byte 0x9f /* lahf */\n"
#else
" lahf\n"
#endif /* ^__OpenBSD__, etc */
" seto %al\n"
"\n"
" /* Check if SHM region is already mapped. */\n"
"\n"
" movq __afl_area_ptr(%rip), %rdx\n"
" testq %rdx, %rdx\n"
" je __afl_setup\n"
"\n"
"__afl_store:\n"
"\n"
" /* Calculate and store hit for the code location specified in rcx. */\n"
"\n"
#ifndef COVERAGE_ONLY
" xorq __afl_prev_loc(%rip), %rcx\n"
" xorq %rcx, __afl_prev_loc(%rip)\n"
" shrq $1, __afl_prev_loc(%rip)\n"
#endif /* ^!COVERAGE_ONLY */
"\n"
#ifdef SKIP_COUNTS
" orb $1, (%rdx, %rcx, 1)\n"
#else
" incb (%rdx, %rcx, 1)\n"
#endif /* ^SKIP_COUNTS */
"\n"
"__afl_return:\n"
"\n"
" addb $127, %al\n"
#if defined(__OpenBSD__) || (defined(__FreeBSD__) && (__FreeBSD__ < 9))
" .byte 0x9e /* sahf */\n"
#else
" sahf\n"
#endif /* ^__OpenBSD__, etc */
" ret\n"
"\n"
".align 8\n"
"\n"

首先调用一些指令保存寄存器。然后会检查__afl_area_ptr是否为空,这里使用的是用rip寄存器进行寻址,这是因为使用这个指针是全局变量。如果为空,认为是还没有启动,会跳转到__afl_setup,否则会记录过程信息:上条分支的编号在__afl_prev_loc中,与当前分支的编号rcx进行异或(为什么在rcx可见上一段代码),然后将__afl_prev_loc赋值为当前分支的编号(运用了异或可逆的特点),再将当前分支的编号右移一位(能够保证a->b和b->a是不一样的)。然后会执行__afl_area_ptr[_prev_val ^ _cur_val]++,达到记录路径和经过次数的效果。

启动过程

setup_shm

首先在afl-fuzz的main函数中,会初始化共享内存。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
/* Configure shared memory and virgin_bits. This is called at startup. */

EXP_ST void setup_shm(void) {

u8* shm_str;

if (!in_bitmap) memset(virgin_bits, 255, MAP_SIZE);

memset(virgin_tmout, 255, MAP_SIZE);
memset(virgin_crash, 255, MAP_SIZE);

shm_id = shmget(IPC_PRIVATE, MAP_SIZE, IPC_CREAT | IPC_EXCL | 0600);

if (shm_id < 0) PFATAL("shmget() failed");

atexit(remove_shm);

shm_str = alloc_printf("%d", shm_id);

/* If somebody is asking us to fuzz instrumented binaries in dumb mode,
we don't want them to detect instrumentation, since we won't be sending
fork server commands. This should be replaced with better auto-detection
later on, perhaps? */

if (!dumb_mode) setenv(SHM_ENV_VAR, shm_str, 1);

ck_free(shm_str);

trace_bits = shmat(shm_id, NULL, 0);

if (trace_bits == (void *)-1) PFATAL("shmat() failed");

}

首先使用shmget创建一个64kb(MAP_SIZE=64)的共享内存。得到内存id后,将其设置为环境变量。然后会对使用share memory attachshmat,将共享内存连接到当前进程的地址空间,之后就可以通过trace_bits对空间进行访问。

calibrate_case函数中,会检查fork_server是否启动,没有会调用init_forkserver函数

init_forkserver

1
2
3
4
5
6
// afl-fuzz.c: 2005
EXP_ST void init_forkserver(char** argv) {
...
if (pipe(st_pipe) || pipe(ctl_pipe)) PFATAL("pipe() failed");

forksrv_pid = fork();

首先会调用pipe函数创建st_pipectl_pipe管道。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
// afl-fuzz.c: 2020
if (!forksrv_pid) {

struct rlimit r;

/* Umpf. On OpenBSD, the default fd limit for root users is set to
soft 128. Let's try to fix that... */

if (!getrlimit(RLIMIT_NOFILE, &r) && r.rlim_cur < FORKSRV_FD + 2) {

r.rlim_cur = FORKSRV_FD + 2;
setrlimit(RLIMIT_NOFILE, &r); /* Ignore errors */

}

if (mem_limit) {

r.rlim_max = r.rlim_cur = ((rlim_t)mem_limit) << 20;

#ifdef RLIMIT_AS

setrlimit(RLIMIT_AS, &r); /* Ignore errors */

#else

/* This takes care of OpenBSD, which doesn't have RLIMIT_AS, but
according to reliable sources, RLIMIT_DATA covers anonymous
maps - so we should be getting good protection against OOM bugs. */

setrlimit(RLIMIT_DATA, &r); /* Ignore errors */

#endif /* ^RLIMIT_AS */


}

/* Dumping cores is slow and can lead to anomalies if SIGKILL is delivered
before the dump is complete. */

r.rlim_max = r.rlim_cur = 0;

setrlimit(RLIMIT_CORE, &r); /* Ignore errors */

/* Isolate the process and configure standard descriptors. If out_file is
specified, stdin is /dev/null; otherwise, out_fd is cloned instead. */

setsid();

dup2(dev_null_fd, 1);
dup2(dev_null_fd, 2);

if (out_file) {

dup2(dev_null_fd, 0);

} else {

dup2(out_fd, 0);
close(out_fd);

}

/* Set up control and status pipes, close the unneeded original fds. */

if (dup2(ctl_pipe[0], FORKSRV_FD) < 0) PFATAL("dup2() failed");
if (dup2(st_pipe[1], FORKSRV_FD + 1) < 0) PFATAL("dup2() failed");

close(ctl_pipe[0]);
close(ctl_pipe[1]);
close(st_pipe[0]);
close(st_pipe[1]);

close(out_dir_fd);
close(dev_null_fd);
close(dev_urandom_fd);
close(fileno(plot_file));

/* This should improve performance a bit, since it stops the linker from
doing extra work post-fork(). */

if (!getenv("LD_BIND_LAZY")) setenv("LD_BIND_NOW", "1", 0);

/* Set sane defaults for ASAN if nothing else specified. */

setenv("ASAN_OPTIONS", "abort_on_error=1:"
"detect_leaks=0:"
"symbolize=0:"
"allocator_may_return_null=1", 0);

/* MSAN is tricky, because it doesn't support abort_on_error=1 at this
point. So, we do this in a very hacky way. */

setenv("MSAN_OPTIONS", "exit_code=" STRINGIFY(MSAN_ERROR) ":"
"symbolize=0:"
"abort_on_error=1:"
"allocator_may_return_null=1:"
"msan_track_origins=0", 0);

execv(target_path, argv);

/* Use a distinctive bitmap signature to tell the parent about execv()
falling through. */

*(u32*)trace_bits = EXEC_FAIL_SIG;
exit(0);

}

通过前段代码可知,forkserver是通过fork创建的一个子进程,观察子进程的代码,首先修改了rlimit,即申请的资源限制。

使用setsid使自己变为一个独立进程。

进行了一些句柄的设置,主要是将控制管道的读管道复制到FORKSRV_FD,将状态管道的写管道复制到FORKSRV_FD+1

在99行,使用了execv函数,如果发生错误,这个函数会返回,所以直接将FAIL赋值给trace_bits(成功就不会执行这条了)。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
  // afl-fuzz.c: 2127
/* Close the unneeded endpoints. */

close(ctl_pipe[0]);
close(st_pipe[1]);

fsrv_ctl_fd = ctl_pipe[1];
fsrv_st_fd = st_pipe[0];

/* Wait for the fork server to come up, but don't wait too long. */

it.it_value.tv_sec = ((exec_tmout * FORK_WAIT_MULT) / 1000);
it.it_value.tv_usec = ((exec_tmout * FORK_WAIT_MULT) % 1000) * 1000;

setitimer(ITIMER_REAL, &it, NULL);

rlen = read(fsrv_st_fd, &status, 4);

it.it_value.tv_sec = 0;
it.it_value.tv_usec = 0;

setitimer(ITIMER_REAL, &it, NULL);

/* If we have a four-byte "hello" message from the server, we're all set.
Otherwise, try to figure out what went wrong. */

if (rlen == 4) {
OKF("All right - fork server is up.");
return;
}

if (child_timed_out)
FATAL("Timeout while initializing fork server (adjusting -t may help)");

if (waitpid(forksrv_pid, &status, 0) <= 0)
PFATAL("waitpid() failed");

if (WIFSIGNALED(status)) {
...
}
if (*(u32*)trace_bits == EXEC_FAIL_SIG)
...
}
FATAL("Fork server handshake failed");

将控制管道的写句柄(ctl_pipe[1])保存到全局变量fsrv_ctl_fd,将状态管道的读句柄(st_pipe[0])保存到全局变量fsrv_st_fd中。

然后调用setitimer函数设置超时时限,调用read(fsrv_st_fd, &status, 4)函数等待状态管道传回数据(前面子进程起来以后,会传回4字节的hello消息),接收到该信息后(rlen == 4)表明forkserver已经正常启动了;否则说明forkserver启动失败,通过子进程返回的消息以及看trace_bits是否是EXEC_FAIL_SIG去看失败的原因。

afl-as

fuzzer的初始化看完了,再看看目标程序中,是如何进行setup的。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
  // afl-as.h: 442
"__afl_setup:\n"
"\n"
" /* Do not retry setup if we had previous failures. */\n"
"\n"
" cmpb $0, __afl_setup_failure(%rip)\n"
" jne __afl_return\n"
"\n"
" /* Check out if we have a global pointer on file. */\n"
"\n"
#ifndef __APPLE__
" movq __afl_global_area_ptr@GOTPCREL(%rip), %rdx\n"
" movq (%rdx), %rdx\n"
#else
" movq __afl_global_area_ptr(%rip), %rdx\n"
#endif /* !^__APPLE__ */
" testq %rdx, %rdx\n"
" je __afl_setup_first\n"
"\n"
" movq %rdx, __afl_area_ptr(%rip)\n"
" jmp __afl_store\n"
"\n"

首先会判断是否有失败过,有就退出,然后观察__afl_global_area_ptr是否有值,没有会跳到first_setup函数。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
 // afl-as.h: 463
"__afl_setup_first:\n"
"\n"
" /* Save everything that is not yet saved and that may be touched by\n"
" getenv() and several other libcalls we'll be relying on. */\n"
"\n"
" leaq -352(%rsp), %rsp\n"
"\n"
" movq %rax, 0(%rsp)\n"
" movq %rcx, 8(%rsp)\n"
" movq %rdi, 16(%rsp)\n"
" movq %rsi, 32(%rsp)\n"
" movq %r8, 40(%rsp)\n"
" movq %r9, 48(%rsp)\n"
" movq %r10, 56(%rsp)\n"
" movq %r11, 64(%rsp)\n"
"\n"
" movq %xmm0, 96(%rsp)\n"
" movq %xmm1, 112(%rsp)\n"
" movq %xmm2, 128(%rsp)\n"
" movq %xmm3, 144(%rsp)\n"
" movq %xmm4, 160(%rsp)\n"
" movq %xmm5, 176(%rsp)\n"
" movq %xmm6, 192(%rsp)\n"
" movq %xmm7, 208(%rsp)\n"
" movq %xmm8, 224(%rsp)\n"
" movq %xmm9, 240(%rsp)\n"
" movq %xmm10, 256(%rsp)\n"
" movq %xmm11, 272(%rsp)\n"
" movq %xmm12, 288(%rsp)\n"
" movq %xmm13, 304(%rsp)\n"
" movq %xmm14, 320(%rsp)\n"
" movq %xmm15, 336(%rsp)\n"
"\n"
// afl-as.h: 496
" /* Map SHM, jumping to __afl_setup_abort if something goes wrong. */\n"
"\n"
" /* The 64-bit ABI requires 16-byte stack alignment. We'll keep the\n"
" original stack ptr in the callee-saved r12. */\n"
"\n"
" pushq %r12\n"
" movq %rsp, %r12\n"
" subq $16, %rsp\n"
" andq $0xfffffffffffffff0, %rsp\n"
"\n"
" leaq .AFL_SHM_ENV(%rip), %rdi\n"
CALL_L64("getenv")
"\n"
" testq %rax, %rax\n"
" je __afl_setup_abort\n"
"\n"
" movq %rax, %rdi\n"
CALL_L64("atoi")
"\n"
" xorq %rdx, %rdx /* shmat flags */\n"
" xorq %rsi, %rsi /* requested addr */\n"
" movq %rax, %rdi /* SHM ID */\n"
CALL_L64("shmat")
"\n"
" cmpq $-1, %rax\n"
" je __afl_setup_abort\n"
"\n"
" /* Store the address of the SHM region. */\n"
"\n"
" movq %rax, %rdx\n"
" movq %rax, __afl_area_ptr(%rip)\n"
"\n"
#ifdef __APPLE__
" movq %rax, __afl_global_area_ptr(%rip)\n"
#else
" movq __afl_global_area_ptr@GOTPCREL(%rip), %rdx\n"
" movq %rax, (%rdx)\n"
#endif /* ^__APPLE__ */
" movq %rax, %rdx\n"
"\n"

__afl_setup_first中,先将寄存器的值放到了栈中保存,从环境变量中获取到共享内存的id,并且映射到进程空间,再把指针赋值给__afl_area_ptr中。

获取了共享内存后,就方便进行分支信息记录。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
// afl-as.h: 536
"__afl_forkserver:\n"
"\n"
" /* Enter the fork server mode to avoid the overhead of execve() calls. We\n"
" push rdx (area ptr) twice to keep stack alignment neat. */\n"
"\n"
" pushq %rdx\n"
" pushq %rdx\n"
"\n"
" /* Phone home and tell the parent that we're OK. (Note that signals with\n"
" no SA_RESTART will mess it up). If this fails, assume that the fd is\n"
" closed because we were execve()d from an instrumented binary, or because\n"
" the parent doesn't want to use the fork server. */\n"
"\n"
" movq $4, %rdx /* length */\n"
" leaq __afl_temp(%rip), %rsi /* data */\n"
" movq $" STRINGIFY((FORKSRV_FD + 1)) ", %rdi /* file desc */\n"
CALL_L64("write")
"\n"
" cmpq $4, %rax\n"
" jne __afl_fork_resume\n"
"\n"

调用write向状态管道写入4字节信息,回想父进程的设置定时器,等待管道中读到4字节信息,这个是对应的。

初始化完成后,进入执行循环。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
// afl-as.h: 557
"__afl_fork_wait_loop:\n"
"\n"
" /* Wait for parent by reading from the pipe. Abort if read fails. */\n"
"\n"
" movq $4, %rdx /* length */\n"
" leaq __afl_temp(%rip), %rsi /* data */\n"
" movq $" STRINGIFY(FORKSRV_FD) ", %rdi /* file desc */\n"
CALL_L64("read")
" cmpq $4, %rax\n"
" jne __afl_die\n"
"\n"
" /* Once woken up, create a clone of our process. This is an excellent use\n"
" case for syscall(__NR_clone, 0, CLONE_PARENT), but glibc boneheadedly\n"
" caches getpid() results and offers no way to update the value, breaking\n"
" abort(), raise(), and a bunch of other things :-( */\n"
"\n"
CALL_L64("fork")
" cmpq $0, %rax\n"
// fork failed
" jl __afl_die\n"
// child thread
" je __afl_fork_resume\n"

先是等待fork_server发号施令,开始的话,执行fork

还是先分析子进程

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
// afl-as.h:606
"__afl_fork_resume:\n"
"\n"
" /* In child process: close fds, resume execution. */\n"
"\n"
" movq $" STRINGIFY(FORKSRV_FD) ", %rdi\n"
CALL_L64("close")
"\n"
" movq $" STRINGIFY((FORKSRV_FD + 1)) ", %rdi\n"
CALL_L64("close")
"\n"
" popq %rdx\n"
" popq %rdx\n"
"\n"
" movq %r12, %rsp\n"
" popq %r12\n"
"\n"
" movq 0(%rsp), %rax\n"
" movq 8(%rsp), %rcx\n"
" movq 16(%rsp), %rdi\n"
" movq 32(%rsp), %rsi\n"
" movq 40(%rsp), %r8\n"
" movq 48(%rsp), %r9\n"
" movq 56(%rsp), %r10\n"
" movq 64(%rsp), %r11\n"
"\n"
" movq 96(%rsp), %xmm0\n"
" movq 112(%rsp), %xmm1\n"
" movq 128(%rsp), %xmm2\n"
" movq 144(%rsp), %xmm3\n"
" movq 160(%rsp), %xmm4\n"
" movq 176(%rsp), %xmm5\n"
" movq 192(%rsp), %xmm6\n"
" movq 208(%rsp), %xmm7\n"
" movq 224(%rsp), %xmm8\n"
" movq 240(%rsp), %xmm9\n"
" movq 256(%rsp), %xmm10\n"
" movq 272(%rsp), %xmm11\n"
" movq 288(%rsp), %xmm12\n"
" movq 304(%rsp), %xmm13\n"
" movq 320(%rsp), %xmm14\n"
" movq 336(%rsp), %xmm15\n"
"\n"
" leaq 352(%rsp), %rsp\n"
"\n"
" jmp __afl_store\n"
"\n"

先是恢复寄存器,然后跳转到__afl_store将分支记录存储。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// afl-as.h:580
// parent thread continue
" /* In parent process: write PID to pipe, then wait for child. */\n"
"\n"
" movl %eax, __afl_fork_pid(%rip)\n"
"\n"
" movq $4, %rdx /* length */\n"
" leaq __afl_fork_pid(%rip), %rsi /* data */\n"
" movq $" STRINGIFY((FORKSRV_FD + 1)) ", %rdi /* file desc */\n"
CALL_L64("write")
"\n"
" movq $0, %rdx /* no flags */\n"
" leaq __afl_temp(%rip), %rsi /* status */\n"
" movq __afl_fork_pid(%rip), %rdi /* PID */\n"
CALL_L64("waitpid")
" cmpq $0, %rax\n"
" jle __afl_die\n"
"\n"
" /* Relay wait status to pipe, then loop back. */\n"
"\n"
" movq $4, %rdx /* length */\n"
" leaq __afl_temp(%rip), %rsi /* data */\n"
" movq $" STRINGIFY((FORKSRV_FD + 1)) ", %rdi /* file desc */\n"
CALL_L64("write")
"\n"
" jmp __afl_fork_wait_loop\n"
"\n"

在父进程中,将子进程运行的pid写给server,然后等待子进程结束,将状态码写给fuzzer。

最后跳回到loop等待下一轮fuzz。

总结

总的看下来,是不是有点乱,让我们重新理一理。

简单来说,我们这一章介绍了forkserver和app信息交流的方式

在创建forkserver的过程中,afl-fuzz先fork了一个子进程,让子进程去执行app,执行使用的是execv函数,这个函数会执行app,并且是覆盖当前程序的形式,所以进程空间等信息是相同的,进入app后,会根据我们上面分析的过程,先通过管道告诉afl-fuzz:”我准备好了“,然后进入loop等待命令,如果afl-fuzz向控制管道中写入命令,app就会fork出子进程进行fuzz,然后继续进入loop。

afl信息监控

来源:https://tttang.com/archive/1753/

在这一章里,我们会讲解afl如何运行目标程序,并且获取运行状态的。

运行目标程序

在afl-fuzz中,我们的fuzz过程主要在一个while死循环中进行。

1
2
3
4
5
// afl-fuzz.c:8091
while (1) {
...
skipped_fuzz = fuzz_one(use_argv);
...

而在fuzz_one中会调用common_fuzz_stuff函数。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
// afl-fuzz.c: 4646
/* Write a modified test case, run program, process results. Handle
error conditions, returning 1 if it's time to bail out. This is
a helper function for fuzz_one(). */

EXP_ST u8 common_fuzz_stuff(char** argv, u8* out_buf, u32 len) {

u8 fault;

if (post_handler) {

out_buf = post_handler(out_buf, &len);
if (!out_buf || !len) return 0;

}

write_to_testcase(out_buf, len);

fault = run_target(argv, exec_tmout);

if (stop_soon) return 1;

if (fault == FAULT_TMOUT) {

if (subseq_tmouts++ > TMOUT_LIMIT) {
cur_skipped_paths++;
return 1;
}

} else subseq_tmouts = 0;

/* Users can hit us with SIGUSR1 to request the current input
to be abandoned. */

if (skip_requested) {

skip_requested = 0;
cur_skipped_paths++;
return 1;

}

/* This handles FAULT_ERROR for us: */

queued_discovered += save_if_interesting(argv, out_buf, len, fault);

if (!(stage_cur % stats_update_freq) || stage_cur + 1 == stage_max)
show_stats();

return 0;

}

post_handler是用户的hook函数。

write_to_testcase是保存变异的输入到指定位置。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// afl-fuzz.c: 2504
/* Write modified data to file for testing. If out_file is set, the old file
is unlinked and a new one is created. Otherwise, out_fd is rewound and
truncated. */

static void write_to_testcase(void* mem, u32 len) {

s32 fd = out_fd;

if (out_file) {

unlink(out_file); /* Ignore errors. */

fd = open(out_file, O_WRONLY | O_CREAT | O_EXCL, 0600);

if (fd < 0) PFATAL("Unable to create '%s'", out_file);

} else lseek(fd, 0, SEEK_SET);

ck_write(fd, mem, len, out_file);

if (!out_file) {

if (ftruncate(fd, len)) PFATAL("ftruncate() failed");
lseek(fd, 0, SEEK_SET);

} else close(fd);

}

out_file是表示目标程序是否从文件中获取数据,和标准输入有不同的处理方式。

保存好后直接使用run_target运行目标程序。

1
2
3
4
5
6
7
8
9
10
11
12
13
// afl-fuzz.c: 2287
/* Execute target application, monitoring for timeouts. Return status
information. The called program will update trace_bits[]. */

static u8 run_target(char** argv, u32 timeout) {

...

/* After this memset, trace_bits[] are effectively volatile, so we
must prevent any earlier operations from venturing into that
territory. */

memset(trace_bits, 0, MAP_SIZE);

首先清除共享内存信息。

然后会判断是否有forkserver或者不使用插桩,这个是直接运行程序,我们不关注。

1
2
// afl-fuzz.c: 2313
if (dumb_mode == 1 || no_forkserver) {

我们直接看对应插桩程序的执行代码。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// afl-fuzz.c: 2390
} else {

s32 res;

/* In non-dumb mode, we have the fork server up and running, so simply
tell it to have at it, and then read back PID. */

if ((res = write(fsrv_ctl_fd, &prev_timed_out, 4)) != 4) {

if (stop_soon) return 0;
RPFATAL(res, "Unable to request new process from fork server (OOM?)");

}

if ((res = read(fsrv_st_fd, &child_pid, 4)) != 4) {

if (stop_soon) return 0;
RPFATAL(res, "Unable to request new process from fork server (OOM?)");

}

if (child_pid <= 0) FATAL("Fork server is misbehaving (OOM?)");

}

首先向控制管道中写入了上一次运行是否超时的信息。

目标程序中的server会接收到信息,进行一次fuzz

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
// afl-fuzz.c: 2415
/* Configure timeout, as requested by user, then wait for child to terminate. */

it.it_value.tv_sec = (timeout / 1000);
it.it_value.tv_usec = (timeout % 1000) * 1000;

setitimer(ITIMER_REAL, &it, NULL);

/* The SIGALRM handler simply kills the child_pid and sets child_timed_out. */

if (dumb_mode == 1 || no_forkserver) {

if (waitpid(child_pid, &status, 0) <= 0) PFATAL("waitpid() failed");

} else {

s32 res;

if ((res = read(fsrv_st_fd, &status, 4)) != 4) {

if (stop_soon) return 0;
RPFATAL(res, "Unable to communicate with fork server (OOM?)");

}

}

if (!WIFSTOPPED(status)) child_pid = 0;

getitimer(ITIMER_REAL, &it);
exec_ms = (u64) timeout - (it.it_value.tv_sec * 1000 +
it.it_value.tv_usec / 1000);

it.it_value.tv_sec = 0;
it.it_value.tv_usec = 0;

setitimer(ITIMER_REAL, &it, NULL);

total_execs++;

这里通过itimer进行了超时处理,并且进行记录。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
// afl-fuzz.c: 2454
/* Any subsequent operations on trace_bits must not be moved by the
compiler below this point. Past this location, trace_bits[] behave
very normally and do not have to be treated as volatile. */

MEM_BARRIER();

tb4 = *(u32*)trace_bits;

#ifdef WORD_SIZE_64
classify_counts((u64*)trace_bits);
#else
classify_counts((u32*)trace_bits);
#endif /* ^WORD_SIZE_64 */

prev_timed_out = child_timed_out;

/* Report outcome to caller. */

if (WIFSIGNALED(status) && !stop_soon) {

kill_signal = WTERMSIG(status);

if (child_timed_out && kill_signal == SIGKILL) return FAULT_TMOUT;

return FAULT_CRASH;

}

/* A somewhat nasty hack for MSAN, which doesn't support abort_on_error and
must use a special exit code. */

if (uses_asan && WEXITSTATUS(status) == MSAN_ERROR) {
kill_signal = 0;
return FAULT_CRASH;
}

if ((dumb_mode == 1 || no_forkserver) && tb4 == EXEC_FAIL_SIG)
return FAULT_ERROR;

/* It makes sense to account for the slowest units only if the testcase was run
under the user defined timeout. */
if (!(timeout > exec_tmout) && (slowest_exec_ms < exec_ms)) {
slowest_exec_ms = exec_ms;
}

return FAULT_NONE;

最后一部分会对trace_bits进行一些操作。不过这个函数在下一部分再讲。

有效性分析

样本的有效性是指目标程序使用该样本运行后,该样本是否增加了目标程序的覆盖率、是否导致目标程序崩溃以及是否导致目标程序崩溃。在目标程序完成运行后,覆盖率信息都记录在共享内存trace_bits中。会将记录的覆盖率信息进行的简单的处理后,调用save_if_interesting函数来看此次运行的样例是否有效。

save_if_interesting函数中,会调用has_new_bits函数查看是否有新增的路径,如果有的话将该样本添加到种子队列中;然后调用calibrate_case函数来校正样例的运行行为;对于新增到种子队列中的样例会调用update_bitmap_score函数根据样例运行的状态对总的种子情况进行新的排序,以决定下次运行时挑选对输入种子;最后根据目标程序的运行结果(超时、崩溃)保存到对应的路径当中。

变量分析

  • trace_bits共享内存,fuzzer和目标程序都会用到,用于记录分支覆盖情况。
  • virgin_bits用来比对路径是否有新增,virgin_tmout是超时样例,virgin_crash是崩溃样例,三个运行完都是trace_bits的取反
  • top_rated是针对每条边挑选当前最合适的种子。

路径运行次数随着运行数值的增加影响越来越小,所以afl规定了一个范围的归约。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// afl-fuzz.c: 1139
/* Destructively classify execution counts in a trace. This is used as a
preprocessing step for any newly acquired traces. Called on every exec,
must be fast. */

static const u8 count_class_lookup8[256] = {

[0] = 0,
[1] = 1,
[2] = 2,
[3] = 4,
[4 ... 7] = 8,
[8 ... 15] = 16,
[16 ... 31] = 32,
[32 ... 127] = 64,
[128 ... 255] = 128

};

classify_counts是统计运行次数的。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// afl-fuzz.c: 1175
static inline void classify_counts(u64* mem) {

u32 i = MAP_SIZE >> 3;

while (i--) {

/* Optimize for sparse bitmaps. */

if (unlikely(*mem)) {

u16* mem16 = (u16*)mem;

mem16[0] = count_class_lookup16[mem16[0]];
mem16[1] = count_class_lookup16[mem16[1]];
mem16[2] = count_class_lookup16[mem16[2]];
mem16[3] = count_class_lookup16[mem16[3]];

}

mem++;

}

}

在前文中,提到classify_counts((u64*)trace_bits);,可见这个函数将数组中的元素进行归约。这里使用unlikely宏对mem进行了判断,这个常用于优化编译器,使得处理器的流水线作业更加顺畅,意思是mem的值更有可能为0,让编译器优化汇编语言(因为mem长度挺大的,这个循环要做很多次,不能总有分支跳转);而更有可能为0的原因是我们在做fuzz的时候,大部分分支是碰不到的,所以更容易为0。

样本有趣

让我们回到common_fuzz_stuff函数中来,在run_target函数后,是这些代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// afl-fuzz.c: 4665
if (stop_soon) return 1;

if (fault == FAULT_TMOUT) {

if (subseq_tmouts++ > TMOUT_LIMIT) {
cur_skipped_paths++;
return 1;
}

} else subseq_tmouts = 0;

/* Users can hit us with SIGUSR1 to request the current input
to be abandoned. */

if (skip_requested) {

skip_requested = 0;
cur_skipped_paths++;
return 1;

}

/* This handles FAULT_ERROR for us: */

queued_discovered += save_if_interesting(argv, out_buf, len, fault);

if (!(stage_cur % stats_update_freq) || stage_cur + 1 == stage_max)
show_stats();

return 0;

会检查停止标志位、错误,和跳过的请求。然后会将save_if_interesting函数的返回值加到queued_discoverd中。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// afl-fuzz.c: 3159
/* Check if the result of an execve() during routine fuzzing is interesting,
save or queue the input test case for further analysis if so. Returns 1 if
entry is saved, 0 otherwise. */

static u8 save_if_interesting(char** argv, void* mem, u32 len, u8 fault) {

u8 *fn = "";
u8 hnb;
s32 fd;
u8 keeping = 0, res;

if (fault == crash_mode) {

/* Keep only if there are new bits in the map, add to queue for
future fuzzing, etc. */

if (!(hnb = has_new_bits(virgin_bits))) {
if (crash_mode) total_crashes++;
return 0;
}

#ifndef SIMPLE_FILES

fn = alloc_printf("%s/queue/id:%06u,%s", out_dir, queued_paths,
describe_op(hnb));

#else

fn = alloc_printf("%s/queue/id_%06u", out_dir, queued_paths);

#endif /* ^!SIMPLE_FILES */

add_to_queue(fn, len, 0);

if (hnb == 2) {
queue_top->has_new_cov = 1;
queued_with_cov++;
}

queue_top->exec_cksum = hash32(trace_bits, MAP_SIZE, HASH_CONST);

crash_mode通常都是FAULT_NONE,所以看看这个分支做了什么事情。

首先调用has_new_bits来与总覆盖率的状态virgin_bits比较,查看这次运行是否产生了新的路径,如果有,就调用add_to_queue函数将其作为新的种子,并且通过hash函数计算覆盖率的哈希,否则返回。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
// afl-fuzz.c: 899
/* Check if the current execution path brings anything new to the table.
Update virgin bits to reflect the finds. Returns 1 if the only change is
the hit-count for a particular tuple; 2 if there are new tuples seen.
Updates the map, so subsequent calls will always return 0.

This function is called after every exec() on a fairly large buffer, so
it needs to be fast. We do this in 32-bit and 64-bit flavors. */

static inline u8 has_new_bits(u8* virgin_map) {

#ifdef WORD_SIZE_64

u64* current = (u64*)trace_bits;
u64* virgin = (u64*)virgin_map;

u32 i = (MAP_SIZE >> 3);

#else

u32* current = (u32*)trace_bits;
u32* virgin = (u32*)virgin_map;

u32 i = (MAP_SIZE >> 2);

#endif /* ^WORD_SIZE_64 */

u8 ret = 0;

while (i--) {

/* Optimize for (*current & *virgin) == 0 - i.e., no bits in current bitmap
that have not been already cleared from the virgin map - since this will
almost always be the case. */

if (unlikely(*current) && unlikely(*current & *virgin)) {

if (likely(ret < 2)) {

u8* cur = (u8*)current;
u8* vir = (u8*)virgin;

/* Looks like we have not found any new bytes yet; see if any non-zero
bytes in current[] are pristine in virgin[]. */

#ifdef WORD_SIZE_64

if ((cur[0] && vir[0] == 0xff) || (cur[1] && vir[1] == 0xff) ||
(cur[2] && vir[2] == 0xff) || (cur[3] && vir[3] == 0xff) ||
(cur[4] && vir[4] == 0xff) || (cur[5] && vir[5] == 0xff) ||
(cur[6] && vir[6] == 0xff) || (cur[7] && vir[7] == 0xff)) ret = 2;
else ret = 1;

#else

if ((cur[0] && vir[0] == 0xff) || (cur[1] && vir[1] == 0xff) ||
(cur[2] && vir[2] == 0xff) || (cur[3] && vir[3] == 0xff)) ret = 2;
else ret = 1;

#endif /* ^WORD_SIZE_64 */

}

*virgin &= ~*current;

}

current++;
virgin++;

}

if (ret && virgin_map == virgin_bits) bitmap_changed = 1;

return ret;

}

首先回顾一下,virgin_bits一开始所有位都是1,而trace_bits一开始都是0,virgin_bits是用来指示哪些分支还没有被覆盖到的,在函数has_new_bits中,对数组会进行遍历。

通常情况下*current*current&*virgin都是0,因为覆盖的总是一小部分,而且1和0做位与也是0。然后每个字节判断,如果出现了新的字节,ret赋值为2(这里如果有新的路径,则vir[i]为0xff,因为没被发现过,同时cur[i]也为0xff,所以与的结果为0xff),反之则ret为1。

在一个循环末尾,更新virgin数组,将新发现的路径标志位置为0。

回到save_if_interesting函数中,判断完是否有新的路径,会调用calibrate_case对样例运行的状态进行校正,实现保证它运行的状态是确定的;然后将样例的数据保存到对应的文件路径中。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// afl-fuzz.c: 3203
/* Try to calibrate inline; this also calls update_bitmap_score() when
successful. */

res = calibrate_case(argv, queue_top, mem, queue_cycle - 1, 0);

if (res == FAULT_ERROR)
FATAL("Unable to execute target application");

fd = open(fn, O_WRONLY | O_CREAT | O_EXCL, 0600);
if (fd < 0) PFATAL("Unable to create '%s'", fn);
ck_write(fd, mem, len, fn);
close(fd);

keeping = 1;

}

在之前,我们简单地说过calibrate_case函数会检查forkserver状态并且启动,这里来仔细剖析。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
// afl-fuzz.c: 2567
/* Calibrate a new test case. This is done when processing the input directory
to warn about flaky or otherwise problematic test cases early on; and when
new paths are discovered to detect variable behavior and so on. */

static u8 calibrate_case(char** argv, struct queue_entry* q, u8* use_mem,
u32 handicap, u8 from_queue) {

...

q->cal_failed++;

stage_name = "calibration";
stage_max = fast_cal ? 3 : CAL_CYCLES;

/* Make sure the forkserver is up before we do anything, and let's not
count its spin-up time toward binary calibration. */

if (dumb_mode != 1 && !no_forkserver && !forksrv_pid)
init_forkserver(argv);

if (q->exec_cksum) {

memcpy(first_trace, trace_bits, MAP_SIZE);
hnb = has_new_bits(virgin_bits);
if (hnb > new_bits) new_bits = hnb;

}

start_us = get_cur_time_us();

for (stage_cur = 0; stage_cur < stage_max; stage_cur++) {

u32 cksum;

if (!first_run && !(stage_cur % stats_update_freq)) show_stats();

write_to_testcase(use_mem, q->len);

fault = run_target(argv, use_tmout);

/* stop_soon is set by the handler for Ctrl+C. When it's pressed,
we want to bail out quickly. */

if (stop_soon || fault != crash_mode) goto abort_calibration;

if (!dumb_mode && !stage_cur && !count_bytes(trace_bits)) {
fault = FAULT_NOINST;
goto abort_calibration;
}

cksum = hash32(trace_bits, MAP_SIZE, HASH_CONST);

if (q->exec_cksum != cksum) {

hnb = has_new_bits(virgin_bits);
if (hnb > new_bits) new_bits = hnb;

if (q->exec_cksum) {

u32 i;

for (i = 0; i < MAP_SIZE; i++) {

if (!var_bytes[i] && first_trace[i] != trace_bits[i]) {

var_bytes[i] = 1;
stage_max = CAL_CYCLES_LONG;

}

}

var_detected = 1;

} else {

q->exec_cksum = cksum;
memcpy(first_trace, trace_bits, MAP_SIZE);

}

}

}

跳过之前讲的初始化forkserver的部分。检查q->exec_cksum是否有值,这个值是之前计算的路径的哈希值,有则证明种子被运行过。

如果运行过,则将trace_bits的值保存到first_trace中,查看是否有新路径。

再往下,会进行几次循环,对当前的种子再次运行,看运行路径的哈希是否与之前的一样,如果不一致,则不一致的地方可以被标记为变量var_bytes,并将标志位var_detected置位,同时将运行次数增加为40次。

循环完成后,根据执行的时间等保存样例运行状态。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
// afl-fuzz.c: 2669
stop_us = get_cur_time_us();

total_cal_us += stop_us - start_us;
total_cal_cycles += stage_max;

/* OK, let's collect some stats about the performance of this test case.
This is used for fuzzing air time calculations in calculate_score(). */

q->exec_us = (stop_us - start_us) / stage_max;
q->bitmap_size = count_bytes(trace_bits);
q->handicap = handicap;
q->cal_failed = 0;

total_bitmap_size += q->bitmap_size;
total_bitmap_entries++;

update_bitmap_score(q);

/* If this case didn't result in new output from the instrumentation, tell
parent. This is a non-critical problem, but something to warn the user
about. */

if (!dumb_mode && first_run && !fault && !new_bits) fault = FAULT_NOBITS;

abort_calibration:

if (new_bits == 2 && !q->has_new_cov) {
q->has_new_cov = 1;
queued_with_cov++;
}

/* Mark variable paths. */

if (var_detected) {

var_byte_count = count_bytes(var_bytes);

if (!q->var_behavior) {
mark_as_variable(q);
queued_variable++;
}

}

stage_name = old_sn;
stage_cur = old_sc;
stage_max = old_sm;

if (!first_run) show_stats();

return fault;

}

在循环执行完成后,保存一些信息,bitmap_size是种子到达路径的数量。

使用update_bitmap_score函数计算种子得分。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
// afl-fuzz.c: 1255
/* When we bump into a new path, we call this to see if the path appears
more "favorable" than any of the existing ones. The purpose of the
"favorables" is to have a minimal set of paths that trigger all the bits
seen in the bitmap so far, and focus on fuzzing them at the expense of
the rest.

The first step of the process is to maintain a list of top_rated[] entries
for every byte in the bitmap. We win that slot if there is no previous
contender, or if the contender has a more favorable speed x size factor. */

static void update_bitmap_score(struct queue_entry* q) {

u32 i;
u64 fav_factor = q->exec_us * q->len;

/* For every byte set in trace_bits[], see if there is a previous winner,
and how it compares to us. */

for (i = 0; i < MAP_SIZE; i++)

if (trace_bits[i]) {

if (top_rated[i]) {

/* Faster-executing or smaller test cases are favored. */

if (fav_factor > top_rated[i]->exec_us * top_rated[i]->len) continue;

/* Looks like we're going to win. Decrease ref count for the
previous winner, discard its trace_bits[] if necessary. */

if (!--top_rated[i]->tc_ref) {
ck_free(top_rated[i]->trace_mini);
top_rated[i]->trace_mini = 0;
}

}

/* Insert ourselves as the new winner. */

top_rated[i] = q;
q->tc_ref++;

if (!q->trace_mini) {
q->trace_mini = ck_alloc(MAP_SIZE >> 3);
minimize_bits(q->trace_mini, trace_bits);
}

score_changed = 1;

}

}

种子的得分是运行时间乘种子长度。得分越少说明消耗资源越少,价值越高。

函数主体是对trace_bits遍历,top_rated是一个大小为MAP_SIZE元素类型为种子指针的数组。那么如果在某个分支上有“winner”,会与当前进行比较,如果价值更高(分数更低),会减少”winner“的引用数,当为0时,进行释放,反之遍历下一个分支。

然后将我们这次的种子作为新的“winner”。同时将引用次数+1,并将score_changed置位,表明所有种子队列的排名已经发生了变化。而且如果这个种子第一次成为”winner“,还会创建一个trace_mini,来压缩表示种子的执行路径。

这里calibrate_case函数就结束了,我们继续回到save_if_interesting函数中来,接下来会对不同的错误进行一些处理。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
  // afl-fuzz.c: 3217
switch (fault) {

case FAULT_TMOUT:

/* Timeouts are not very interesting, but we're still obliged to keep
a handful of samples. We use the presence of new bits in the
hang-specific bitmap as a signal of uniqueness. In "dumb" mode, we
just keep everything. */

total_tmouts++;

if (unique_hangs >= KEEP_UNIQUE_HANG) return keeping;

if (!dumb_mode) {

#ifdef WORD_SIZE_64
simplify_trace((u64*)trace_bits);
#else
simplify_trace((u32*)trace_bits);
#endif /* ^WORD_SIZE_64 */

if (!has_new_bits(virgin_tmout)) return keeping;

}

unique_tmouts++;

/* Before saving, we make sure that it's a genuine hang by re-running
the target with a more generous timeout (unless the default timeout
is already generous). */

if (exec_tmout < hang_tmout) {

u8 new_fault;
write_to_testcase(mem, len);
new_fault = run_target(argv, hang_tmout);

/* A corner case that one user reported bumping into: increasing the
timeout actually uncovers a crash. Make sure we don't discard it if
so. */

if (!stop_soon && new_fault == FAULT_CRASH) goto keep_as_crash;

if (stop_soon || new_fault != FAULT_TMOUT) return keeping;

}

#ifndef SIMPLE_FILES

fn = alloc_printf("%s/hangs/id:%06llu,%s", out_dir,
unique_hangs, describe_op(0));

#else

fn = alloc_printf("%s/hangs/id_%06llu", out_dir,
unique_hangs);

#endif /* ^!SIMPLE_FILES */

unique_hangs++;

last_hang_time = get_cur_time();

break;

case FAULT_CRASH:

keep_as_crash:

/* This is handled in a manner roughly similar to timeouts,
except for slightly different limits and no need to re-run test
cases. */

total_crashes++;

if (unique_crashes >= KEEP_UNIQUE_CRASH) return keeping;

if (!dumb_mode) {

#ifdef WORD_SIZE_64
simplify_trace((u64*)trace_bits);
#else
simplify_trace((u32*)trace_bits);
#endif /* ^WORD_SIZE_64 */

if (!has_new_bits(virgin_crash)) return keeping;

}

if (!unique_crashes) write_crash_readme();

#ifndef SIMPLE_FILES

fn = alloc_printf("%s/crashes/id:%06llu,sig:%02u,%s", out_dir,
unique_crashes, kill_signal, describe_op(0));

#else

fn = alloc_printf("%s/crashes/id_%06llu_%02u", out_dir, unique_crashes,
kill_signal);

#endif /* ^!SIMPLE_FILES */

unique_crashes++;

last_crash_time = get_cur_time();
last_crash_execs = total_execs;

break;

case FAULT_ERROR: FATAL("Unable to execute target application");

default: return keeping;

}

/* If we're here, we apparently want to save the crash or hang
test case, too. */

fd = open(fn, O_WRONLY | O_CREAT | O_EXCL, 0600);
if (fd < 0) PFATAL("Unable to create '%s'", fn);
ck_write(fd, mem, len, fn);
close(fd);

ck_free(fn);

return keeping;

}

如果超时,则将样例保存到hang目录中,如果是崩溃,则保存到crash目录中。

afl变异

https://tttang.com/archive/1796/