SGXFuzz: Efficiently Synthesizing Nested Structures for SGX Enclave Fuzzing
| Links: |
Published at Usenix Security 2022
SGXFuzz presents a novel approach to fuzz SGX enclaves in a user-space environment including the synthesis of ECall structures that automatically synthesizes a nested input structure as expected by the enclaves using a binary-only approach. The prototype consists of an enclave dumper that extracts enclaves memory from distribution formats, a fuzzing setup to fuzz extracted enclave, as well as a series of scripts to perform result aggregation. The fuzzing setup is the core of SGXFuzz and is built upon the kAFL fuzzer and the Nyx snapshotting and fuzzing engine. We extend the existing code of kAFL to accommodate our structure synthesis in Python. The Nyx fuzzing engine utilizes the Intel PT CPU extension to get code coverage information but does not contain any changes for SGXFuzz. Finally, we provide several scripts to process the crashes found during the fuzzing campaigns as well as the synthesized structure layouts.
SGXFuzz consists of the enclave dumper, enclave runner, the fuzzing setup, and the enclaves evaluated in the paper. The enclave dumper extracts the enclave memory from enclave distribution formats. This step has to be done only once per enclave, and we have already performed that step for all enclaves. The enclave runner uses the previously extracted enclave memory to run the enclave as a regular user-space process. The runner is a C++ program that loads the enclave memory, handles the emulation of the context switch that would usually be performed by the SGX instruction set and performs the structure allocation for each input. Finally, our fuzzing setup consists of a front end that generates fuzzing inputs and performs the structure synthesis, and a back end that executes the target and collects coverage. We use kAFL as a foundation for our fuzzing front end and add new code to the fuzzer to perform the structure synthesis. The back end consists of a patched version of QEMU and KVM to allow the collection of coverage data using the Intel PT CPU extension. We did not perform any modifications on the fuzzing back end.
Our fuzzing back end consisting of a modified QEMU and KVM uses the Intel PT CPU extension to collect coverage data. Thus, an Intel PT-enabled CPU is required to use our fuzzing setup. However, Intel PT does not work in a virtualized environment and as such, cannot run in VM. Notice that the Intel SGX is not required at any point.
We include a setup script that should perform the major steps.
First, disable SGX in the BIOS if supported by the CPU.
Clone the repository.
Install required packages:
sudo apt install \ python2 python3 libpixman-1-dev pax-utils bc \ make cmake gcc g++ pkg-config unzip \ python3-virtualenv python2-dev python3-dev \ libglib2.0-dev+ cpio gzip
Then, you can use setup.sh to compile and install the components, or follow the steps manually. That is:
- Initialize the submodules:
git submodule update --init --recursive --depth=1
- QEMU-Nyx: https://github.com/nyx-fuzz/QEMU-Nyx#build
- KVM-Nyx: https://github.com/nyx-fuzz/KVM-Nyx#setup-kvm-nyx-binaries
- (Virtual) environments for python2 and python3 and install
- python2: configparser mmh3 lz4 psutil ipdb msgpack inotify
- python3: six python-dateutil msgpack mmh3 lz4 psutil fastrand inotify pgrep tqdm hexdump
- Install zydis (
cd zydis && cmake -DCMAKE_C_FLAGS="-fPIC" -DCMAKE_CXX_FLAGS="-fPIC" . && sudo make install && dependencies/zycore && mkdir build && cd build && cmake .. && make && sudo make install)
- Prepare the ramfs guest image (
cd packer/linux_initramfs/ && ./pack.sh)
- Initialize the submodules:
The experiment workflow includes three main parts: Enclave dumping, Fuzzing, Result aggregation.
First, enclave dumping is used to extract the enclave memory. It is based on
the Linux SGX SDK. By providing the enclave dumper with
a memory dump with the name
enclave.signed.so.mem, a memory layout
enclave.signed.so.layout, and the address of the enclave’s entry point
(specifically the offset of the TCS)
- Compile is using:
make -C ./enclave-dumper/
- Run it using:
To fuzz the previously extracted enclave, several steps are involved. We bundled all of them together in a script that runs a minimal fuzzing test:
The script runs the following steps automatically. First, the enclave runner is compiled using
make-enclave-fuzz-target.sh enclave.signed.so.mem \ enclave.signed.so.tcs.txt
The result of the compilation is a
fuzz-generic binary, which is the
user-space version of the enclave, and a
liblibnyx_dummy.so, which is
required for the fuzzer.
In the next step, the fuzzing target is packed into a VM that is executed using the QEMU-KVM setup. The packer script can be called as follows
nyx_packer.py <enclave-runner> <fuzz-folder> m64 \ --legacy --purge --no_pt_auto_conf_b \ --fast_reload_mode --delayed_init
Finally, the fuzzing can be started using the kAFL fuzzing frontend. The exact
command can be found in the
If desired, manually crafted seeds can be added to the \verb+imports+ folder. Each seed is a file consisting of the ECall ID, the serialized structure definition, and the contents of the buffers.
Display synthesized structures:
display-structs.py <path/to/fuzzing-workdir> \ <ecall_index>
The script displays the evolvement of the synthesized structure in a tree format for each ECall index, with the ECall index being zero-based. The leaves show the final evolvement of the synthesized structures. Each leave shows the synthesized structure in a specific format.
Structures are serialized, e.g.,
40 2 C8 4 0 C24 7 0, and read left to
right. This string denotes a structure of 40\,Bytes, which has two
(2) child structures (C). The first child is at offset 8 of the
parent and is defined the same way: A size of 4 and zero(0) children.
The second child has a size of 7 and also zero children. Further, the
sizes may be annotated with their address (
Additional types include buffers partially (on the edge) of the enclave’s
memory (P) and SizeOf (S) buffers of which the size is written to a defined
This script shows how to parse and dump these strings:
analyze_crashes.py <eval-dir> \ -0 --np --no-ptr-0x7ff --no-large-diff
analyze_crashes.py script iterates through all crashes found by the fuzzer.
The script performs filtering and will only display valid crashes. However,
manual duplication of the crashes is required. The flags supplied to the
script do the filtering according to the described filtering techniques in
the paper. The script displays useful information to understand the crash:
the ECall ID, the signal(usually Segmentation Fault), pc
(absolute/relative), the disassembled instruction, and addresses used for