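libFuzzer-workshop study notes

Overview

libFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine that is part of the LLVM project.

libFuzzer is linked with the library under test and feeds test cases to that library through a fuzzing entry point (the target function).

The fuzzer tracks which regions of code have been reached and mutates the inputs in its corpus to maximize code coverage. The coverage information comes from LLVM's SanitizerCoverage instrumentation.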
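To make "coverage-guided" and "evolutionary" concrete, here is a toy sketch of the feedback loop. It is not libFuzzer's actual algorithm: ToyCoverage and ToyFuzzLoop are hypothetical names, and the "coverage" signal is simply how many leading bytes of the input match "FUZZ", standing in for real SanitizerCoverage feedback.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical target: reports how many leading bytes match "FUZZ".
// The return value stands in for real coverage feedback.
static int ToyCoverage(const std::vector<uint8_t>& in) {
  static const char kMagic[] = "FUZZ";
  int depth = 0;
  while (depth < 4 && static_cast<std::size_t>(depth) < in.size() &&
         in[depth] == static_cast<uint8_t>(kMagic[depth]))
    ++depth;
  return depth;
}

// Toy evolutionary loop: mutate a corpus entry and keep the mutant only if
// it reaches more "coverage" than anything seen so far.
// Returns the best coverage reached within max_iters iterations.
static int ToyFuzzLoop(unsigned seed, long max_iters) {
  std::mt19937 rng(seed);
  std::vector<std::vector<uint8_t>> corpus = {{0, 0, 0, 0}};
  int best = ToyCoverage(corpus[0]);
  for (long i = 0; i < max_iters && best < 4; ++i) {
    // Pick a random corpus entry and overwrite one random byte.
    std::vector<uint8_t> candidate = corpus[rng() % corpus.size()];
    candidate[rng() % candidate.size()] = static_cast<uint8_t>(rng() % 256);
    int cov = ToyCoverage(candidate);
    if (cov > best) {  // new coverage: the input joins the corpus
      best = cov;
      corpus.push_back(candidate);
    }
  }
  return best;
}
```

Because each partial match is kept and mutated further, the loop finds the four-byte magic value one byte at a time instead of having to guess all four bytes at once. The same idea is what lets a coverage-guided fuzzer get past multi-byte comparisons.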

Installation

git clone https://github.com/Dor1s/libfuzzer-workshop.git
sudo ln -s /usr/include/asm-generic /usr/include/asm
apt-get install gcc-multilib

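Then enter libfuzzer-workshop/ and run checkout_build_install_llvm.sh to install LLVM.
Next, enter libfuzzer-workshop/libFuzzer/Fuzzer/ and run build.sh to build libFuzzer.
If the build succeeds, it produces libfuzzer-workshop/libFuzzer/Fuzzer/libFuzzer.a

If the LLVM build fails with an internal compiler error, the machine has probably run out of memory; giving it more RAM or adding swap space usually fixes this.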

Lesson

01-04

Modern_Fuzzing_of_C_C++_projects_slides_1-23

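A brief introduction to unit testing, fuzzing, and modern fuzzing.

Using radamsa to mutate samples drawn at random from a seed corpus, as a simple way to fuzz pdfium
An introduction to libFuzzer, code coverage, and the common memory tools (AddressSanitizer, MemorySanitizer, UndefinedBehaviorSanitizer)
A quick look at fuzzing a few vulnerable functions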

VulnerableFunction1

bool VulnerableFunction1(const uint8_t* data, size_t size) {
  bool result = false;
  if (size >= 3) {
    result = data[0] == 'F' &&
             data[1] == 'U' &&
             data[2] == 'Z' &&
             data[3] == 'Z';
  }

  return result;
}

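data is the input buffer and size is its length. The check only requires size >= 3, so when size is exactly 3 the read of data[3] goes one byte past the end of the buffer.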
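The fuzz target file itself is not reproduced in these notes. A minimal sketch of what such a target could look like follows, written against a corrected copy of the function so that it is safe to run as is. FixedFunction1 is a hypothetical name; the only change from VulnerableFunction1 is the size >= 4 check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical corrected variant: require four bytes before reading data[3].
bool FixedFunction1(const uint8_t* data, size_t size) {
  bool result = false;
  if (size >= 4) {
    result = data[0] == 'F' &&
             data[1] == 'U' &&
             data[2] == 'Z' &&
             data[3] == 'Z';
  }
  return result;
}

// Fuzz target: libFuzzer calls this once for every input it generates.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  FixedFunction1(data, size);
  return 0;  // the return value carries no result; crashes are caught by ASan
}
```

Pointing the same target at the original VulnerableFunction1 is what lets AddressSanitizer report the one-byte over-read once the fuzzer produces a three-byte input starting with "FUZ".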

Compile the fuzzer in the following way:

clang++ -g -std=c++11 -fsanitize=address -fsanitize-coverage=trace-pc-guard \
    first_fuzzer.cc ../../libFuzzer/libFuzzer.a \
    -o first_fuzzer

Adjust the path to libFuzzer.a to match your own setup.

Create an empty directory for corpus and run the fuzzer:

mkdir corpus1
./first_fuzzer corpus1

VulnerableFunction2

template <class T>
typename T::value_type DummyHash(const T& buffer) {
  typename T::value_type hash = 0;
  for (auto value : buffer)
    hash ^= value;

  return hash;
}

constexpr auto kMagicHeader = "ZN_2016";
constexpr std::size_t kMaxPacketLen = 1024;
constexpr std::size_t kMaxBodyLength = 1024 - sizeof(kMagicHeader);

bool VulnerableFunction2(const uint8_t* data, size_t size, bool verify_hash) {
  if (size < sizeof(kMagicHeader))
    return false;

  std::string header(reinterpret_cast<const char*>(data), sizeof(kMagicHeader));

  std::array<uint8_t, kMaxBodyLength> body;  // array length is 1024 - sizeof(kMagicHeader)

  if (strcmp(kMagicHeader, header.c_str()))  // check whether the prefix is ZN_2016
    return false;  

  auto target_hash = data[--size];

  if (size > kMaxPacketLen)
    return false;

  if (!verify_hash)
    return true;

  std::copy(data, data + size, body.data());  // size may be as large as kMaxPacketLen, which exceeds body's capacity: possible overflow
  auto real_hash = DummyHash(body);
  return real_hash == target_hash;
}

The fuzz target for this vulnerable function looks like this:

// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");

#include <stdint.h>
#include <stddef.h>

#include "vulnerable_functions.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  bool flag[2] = {false, true};
  for (auto f : flag)  // try both values: passing only false never reaches the crash
    VulnerableFunction2(data, size, f);
  return 0;
}

With suitable run-time settings, such as an explicit maximum input length, the crash shows up much sooner:

Address 0x7ffedfe60ca8 is located in stack of thread T0 at offset 1128 in frame
    #0 0x4f801f in VulnerableFunction2(unsigned char const*, unsigned long, bool) /home/nevv/libfuzzer-workshop/lessons/04/./vulnerable_functions.h:42

  This frame has 3 object(s):
    [32, 64) 'header' (line 46)
    [96, 97) 'ref.tmp' (line 46)
    [112, 1128) 'body' (line 48) <== Memory access at offset 1128 overflows this variable

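The report shows that body is what overflowed. If we do not set a maximum length up front, the fuzzer may run for a long time without a crash (there are only so many paths, and none of the inputs it tries happens to trigger the crash).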
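For comparison, a bounds-checked copy might look like the sketch below. CopyBodyChecked and kBodyCapacity are hypothetical names; the capacity of 1016 is taken from the ASan report above, where body occupies offsets [112, 1128).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed capacity: the ASan report shows body spanning 1128 - 112 bytes.
constexpr std::size_t kBodyCapacity = 1016;

// Hypothetical helper: copy into body only when the input actually fits.
bool CopyBodyChecked(const uint8_t* data, std::size_t size,
                     std::array<uint8_t, kBodyCapacity>& body) {
  if (size > body.size())
    return false;  // reject instead of writing past the end of body
  std::copy(data, data + size, body.data());
  return true;
}
```

The vulnerable function compares size against kMaxPacketLen (1024) rather than against the capacity of body, which is why inputs between 1017 and 1024 bytes overflow the array.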

libFuzzer run-time options

http://llvm.org/docs/LibFuzzer.html#running

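The copy function

// first [IN]: iterator to the first element to copy
// last [IN]: iterator one past the last element to copy
// x [OUT]: iterator to the start of the destination
template <class InIt, class OutIt>
  OutIt copy(InIt first, InIt last, OutIt x);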
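A small sketch of these semantics: std::copy takes a half-open range, never checks the destination's capacity, and returns an iterator one past the last element written. CopyCount is a hypothetical helper used only to make that return value visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies the half-open range [src, src + n) into dst and returns how many
// elements were written, computed from the iterator std::copy returns.
std::size_t CopyCount(const uint8_t* src, std::size_t n, uint8_t* dst) {
  uint8_t* end = std::copy(src, src + n, dst);  // src + n itself is not read
  return static_cast<std::size_t>(end - dst);
}
```

The caller is responsible for making sure the destination holds at least n elements, which is exactly the check that VulnerableFunction2 gets wrong.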

VulnerableFunction3

constexpr std::size_t kZn2016VerifyHashFlag = 0x0001000;

bool VulnerableFunction3(const uint8_t* data, size_t size, std::size_t flags) {
  bool verify_hash = flags & kZn2016VerifyHashFlag;
  return VulnerableFunction2(data, size, verify_hash);
}

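As before, the extra argument has to be supplied by the fuzz target; here the flags are derived from a hash of the input: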

// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");

#include <stdint.h>
#include <stddef.h>

#include "vulnerable_functions.h"

#include <functional>
#include <string>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  std::string data_string(reinterpret_cast<const char*>(data), size);
  auto data_hash = std::hash<std::string>()(data_string);

  std::size_t flags = static_cast<std::size_t>(data_hash);
  VulnerableFunction3(data, size, flags);
  return 0;
}
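Hashing the input is one way to obtain flags. Another approach, shown here only as an alternative sketch and not as what the workshop does, is to spend the first input byte on the flags and pass the rest as the payload. TargetWithFlags is a hypothetical stand-in for VulnerableFunction3 so that the block compiles on its own, and FlagsFromFirstByte is likewise an invented helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kZn2016VerifyHashFlag = 0x0001000;

// Hypothetical stand-in for VulnerableFunction3: only reports whether the
// verify-hash flag was set, so the sketch has no external dependencies.
bool TargetWithFlags(const uint8_t* data, std::size_t size, std::size_t flags) {
  (void)data;
  (void)size;
  return (flags & kZn2016VerifyHashFlag) != 0;
}

// Map the lowest bit of one control byte onto the verify-hash flag.
std::size_t FlagsFromFirstByte(uint8_t b) {
  return (b & 1) ? kZn2016VerifyHashFlag : 0;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  if (size < 1) return 0;  // need one byte for the flags
  TargetWithFlags(data + 1, size - 1, FlagsFromFirstByte(data[0]));
  return 0;
}
```

With this layout the fuzzer can flip the flag by mutating a single byte, whereas with a hash every mutation of the payload also changes the flags.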

05 OpenSSL Heartbleed vulnerability

Vulnerability overview

See ssl/d1_both.c; the patch for the vulnerability starts at this statement:

int
dtls1_process_heartbeat(SSL *s)
    {
    unsigned char *p = &s->s3->rrec.data[0], *pl;
    unsigned short hbtype;
    unsigned int payload;
    unsigned int padding = 16; /* Use minimum padding */

Right at the start we obtain a pointer to the data of an SSLv3 record. The SSL3_RECORD structure is defined as follows:

typedef struct ssl3_record_st
    {
        int type;               /* type of record */
        unsigned int length;    /* How many bytes available */
        unsigned int off;       /* read/write offset into 'buf' */
        unsigned char *data;    /* pointer to the record data */
        unsigned char *input;   /* where the decode bytes are */
        unsigned char *comp;    /* only used with decompression - malloc()ed */
        unsigned long epoch;    /* epoch number, needed by DTLS1 */
        unsigned char seq_num[8]; /* sequence number, needed by DTLS1 */
    } SSL3_RECORD;

Each SSLv3 record contains a type field (type), a length field (length), and a pointer to the record data (data). Back in dtls1_process_heartbeat:

/* Read type and payload length first */
hbtype = *p++;
n2s(p, payload);
pl = p;

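The first byte of the SSLv3 record indicates the heartbeat type. The n2s macro reads two bytes from the array that p points to and stores them in payload; this is the length field of the heartbeat payload. Note that the program never checks it against the actual length of the SSLv3 record. The variable pl then points to the heartbeat data supplied by the requester.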
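The behaviour described for n2s can be written as a small function. This is a sketch of the same operation, not OpenSSL's macro itself, and ReadU16BE is a hypothetical name: it reads a 16-bit big-endian value and advances the pointer by two bytes.

```cpp
#include <cassert>
#include <cstdint>

// Read a 16-bit big-endian (network order) value at *p and advance p by two
// bytes, mirroring what the text describes for n2s(p, payload).
unsigned int ReadU16BE(const unsigned char*& p) {
  unsigned int v = (static_cast<unsigned int>(p[0]) << 8) | p[1];
  p += 2;
  return v;
}
```

Nothing in this operation knows how long the record really is: a request whose two length bytes are 0x40 0x00 yields payload = 16384 even if only a few bytes follow.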

Later in the function, the following happens:

unsigned char *buffer, *bp;
int r;

/* Allocate memory for the response, size is 1 byte
 * message type, plus 2 bytes payload length, plus
 * payload, plus padding
 */
buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;

So the program allocates a memory region whose size is chosen by the requester, up to (65535 + 1 + 2 + 16) bytes. The variable bp is the pointer used to access this region.

/* Enter response type, length and copy payload */
*bp++ = TLS1_HB_RESPONSE;
s2n(payload, bp);
memcpy(bp, pl, payload);

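The s2n macro does the opposite of n2s: it takes a 16-bit value and stores it as two bytes, so here it writes the requested payload length into the response at bp. The program then copies payload bytes starting at pl, which points to the user-supplied heartbeat data, into the newly allocated buffer at bp.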

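In essence, when OpenSSL processes a heartbeat it never validates the attacker-controlled length field parsed from the packet. The subsequent memcpy can therefore read past the real payload, so server-side memory may be copied into the response and sent back to the user.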
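The missing validation can be sketched in isolation. The function below is a simplified model, not OpenSSL code: BuildHeartbeatResponse is a hypothetical name, the response type value 2 is the heartbeat_response code from RFC 6520, and the padding is left as zeros for brevity. The key line compares the claimed payload length against the number of bytes actually received.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Builds a heartbeat response from a record, or returns an empty vector when
// the claimed payload length does not fit inside the record actually received.
std::vector<unsigned char> BuildHeartbeatResponse(const unsigned char* rec,
                                                  std::size_t rec_len) {
  const std::size_t kPadding = 16;
  if (rec_len < 1 + 2 + kPadding) return {};  // too short for any heartbeat
  // Two-byte big-endian length field, as read by n2s in the original code.
  std::size_t payload = (static_cast<std::size_t>(rec[1]) << 8) | rec[2];
  if (1 + 2 + payload + kPadding > rec_len) return {};  // claimed length lies
  std::vector<unsigned char> out(1 + 2 + payload + kPadding, 0);
  out[0] = 2;       // response type (heartbeat_response in RFC 6520)
  out[1] = rec[1];  // echo the length field, as s2n does in the original
  out[2] = rec[2];
  std::memcpy(out.data() + 3, rec + 3, payload);  // now bounded by rec_len
  return out;
}
```

Without the second check, a 19-byte record that claims a 16384-byte payload would make memcpy read roughly 16 KB beyond the record, which is the leak described above.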

Fuzz target

#include <openssl/ssl.h>
#include <openssl/err.h>
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#ifndef CERT_PATH
# define CERT_PATH
#endif

SSL_CTX *Init() {
  SSL_library_init();
  SSL_load_error_strings();
  ERR_load_BIO_strings();
  OpenSSL_add_all_algorithms();
  SSL_CTX *sctx;
  assert (sctx = SSL_CTX_new(TLSv1_method()));
  /* These two file were created with this command:
      openssl req -x509 -newkey rsa:512 -keyout server.key \
     -out server.pem -days 9999 -nodes -subj /CN=a/
  */
  assert(SSL_CTX_use_certificate_file(sctx, CERT_PATH "server.pem",
                                      SSL_FILETYPE_PEM));
  assert(SSL_CTX_use_PrivateKey_file(sctx, CERT_PATH "server.key",
                                     SSL_FILETYPE_PEM));
  return sctx;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  static SSL_CTX *sctx = Init();
  SSL *server = SSL_new(sctx);
  BIO *sinbio = BIO_new(BIO_s_mem());
  BIO *soutbio = BIO_new(BIO_s_mem());
  SSL_set_bio(server, sinbio, soutbio);
  SSL_set_accept_state(server);
  BIO_write(sinbio, data, size);
  SSL_do_handshake(server);
  SSL_free(server);
  return 0;
}

06 c-ares vulnerability

// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
#include <stdint.h>
#include <stdlib.h>

#include <string>

#include <arpa/nameser.h>

#include <ares.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  unsigned char *buf;
  int buflen;
  std::string s(reinterpret_cast<const char *>(data), size);
  ares_create_query(s.c_str(), ns_c_in, ns_t_a, 0x1234, 0, &buf, &buflen, 0);
  ares_free_string(buf);
  return 0;
}

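With libFuzzer, our job is to follow the target program's logic, pass the test data that libFuzzer generates to the target for processing, and build with a suitable sanitizer to catch memory errors at run time. When time allows, it is still worth reading the source code and the papers related to libFuzzer.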