A Framework to Evaluate Verification Methods by Automatically Injecting Bugs into SoC IPs

Hardware fuzzing is an emerging technique in SoC verification. Recently, there has been a lot of buzz around fuzzers like Google’s “HW like SW”, Intel PreSiFuzz, HyperFuzzer, RFuzz, and many FIRRTL/Chisel fuzzers. However, measuring how effective different fuzzers are against real-world designs can be extremely challenging. This project aims to create a framework that evaluates various verification methods by automatically injecting bugs into standard SoC IPs, drawn from popular open-source repositories such as PULP, TAXI, and ZipCPU as well as proprietary IPs from ARM and Xilinx, giving researchers a standardized benchmark for measuring their fuzzer’s performance.
Features
- Unified Harness: A standardized AXI harness that consumes packed binary commands from standard input, mapping them seamlessly to AXI reads, writes, and parallel transactions.
- Broad Tool Support: Integrated with both Verilator and Synopsys VCS simulation environments.
- Multiple Supported Fuzzers: Built-in scripts to run fuzzers seamlessly (hw_like_sw, hyperfuzzer, presifuzz, hw_fuzzing_afl, rfuzz, etc.).
- Industrial IP Cores: Evaluates against AXI IPs from sources like PULP, TAXI, ARM, and Xilinx.
- Containerized Environment: Full Docker support so anyone can run the entire suite of fuzzers with a single make run_example command.
Technical Details
Fuzzers generate randomly mutated bytes that are driven onto the hardware’s input pins. For protocol-based designs, random mutation makes almost all inputs invalid. A bus grammar and a harness are needed to bridge this gap.
I created a SystemVerilog module that acts as a drop-in replacement for the randomized AXI driver in CRV testbenches. This module pulls data bytes straight from the fuzzer (via /dev/stdin) and translates them into standard AXI channel signals (AW, W, B, AR, R). I packed over 30 separate timing parameters, control attributes, burst sizes, and opcode commands into a command vector.
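To make the idea of a packed command vector concrete, here is a minimal Python sketch of how a byte stream could be sliced into fixed-size AXI commands. The 11-byte layout, the field order, and the names AxiCommand and decode_commands are assumptions made up for illustration; the real harness is written in SystemVerilog and its vector carries 30+ fields. Only the wait-cycle names t_aw_m2s and t_r_s2m come from the harness itself.

```python
import struct
from dataclasses import dataclass
from typing import List

# Hypothetical little-endian layout (11 bytes, no padding):
# opcode, id, 32-bit address, AxLEN, AxSIZE, AxBURST, t_aw_m2s, t_r_s2m
CMD_STRUCT = struct.Struct("<BBIBBBBB")


@dataclass
class AxiCommand:
    opcode: int    # assumed encoding, e.g. read / write / parallel
    axi_id: int
    addr: int
    length: int    # AxLEN: number of beats minus one
    size: int      # AxSIZE: log2(bytes per beat)
    burst: int     # AxBURST: 0 FIXED, 1 INCR, 2 WRAP
    t_aw_m2s: int  # wait cycles before the AW handshake
    t_r_s2m: int   # wait cycles before the R handshake


def decode_commands(data: bytes) -> List[AxiCommand]:
    """Slice raw fuzzer bytes into commands, dropping a trailing partial one."""
    usable = len(data) - len(data) % CMD_STRUCT.size
    commands = []
    for offset in range(0, usable, CMD_STRUCT.size):
        opcode, axi_id, addr, length, size, burst, t_aw, t_r = \
            CMD_STRUCT.unpack_from(data, offset)
        # Mask narrow fields so every byte value maps to some in-range encoding.
        commands.append(AxiCommand(opcode & 0x3, axi_id, addr, length,
                                   size & 0x7, burst & 0x3, t_aw, t_r))
    return commands
```

In this sketch a fuzzer-driven run would call decode_commands(sys.stdin.buffer.read()). Because every byte pattern decodes to some command, the fuzzer never wastes an input on a parse failure.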
For timing and performance, wait cycles (t_aw_m2s, t_r_s2m, etc.) have their own byte slots in the vector and are natively supported by the harness, allowing the fuzzer to create arbitrarily long or short stalls on the handshake signals. I use SystemVerilog queues inside the design to connect our drop-in AXI Source and Sink, so delays can be inserted on both ends.
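The effect of per-side wait cycles on a VALID/READY channel can be illustrated with a small cycle-counting model. This is a sketch under simplifying assumptions, not the harness logic: the function name simulate_channel is hypothetical, each beat's waits are counted from the cycle after the previous handshake, and VALID is assumed to stay high once asserted, as AXI requires.

```python
from typing import List, Tuple


def simulate_channel(beats: List[Tuple[int, int]]) -> List[int]:
    """Return the cycle on which each beat's VALID/READY handshake completes.

    Each beat is (source_wait, sink_wait): the number of idle cycles the
    source spends before raising VALID and the sink before raising READY.
    """
    completed = []
    cycle = 0
    for source_wait, sink_wait in beats:
        valid_at = cycle + source_wait
        ready_at = cycle + sink_wait
        # A transfer happens on the first cycle where both signals are high.
        fire = max(valid_at, ready_at)
        completed.append(fire)
        cycle = fire + 1  # the next beat starts counting on the following cycle
    return completed
```

For example, simulate_channel([(0, 0), (3, 1), (0, 5)]) gives [0, 4, 10]: the second beat is held back by the source and the third by the sink, which is the kind of back-pressure the fuzzer can steer from either end.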
If a command generated by the fuzzer breaks fundamental AXI protocol rules, it is legalized in params.svh before being issued, so the fuzzer doesn’t get stuck chasing invalid protocol states. I also added an extra byte that selects different ways to deliberately violate the AXI protocol, to catch bugs where the module behaves unexpectedly. AXI protocol checkers from ARM (SVA-based) and ZipCPU sit in both the Source and the Sink to detect the DUT violating the protocol; a violation causes a crash, which signals the fuzzer.
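As an illustration of what legalization can look like, the Python sketch below clamps a raw burst request to a few well-known AXI4 rules: AxSIZE may not exceed the data bus width, burst type 0b11 is reserved, WRAP bursts have 2, 4, 8, or 16 beats and an aligned start address, FIXED bursts have at most 16 beats, and a burst may not cross a 4 KB boundary. The function name legalize, the 8-byte default bus width, and the particular repair choices (for example, shortening a burst rather than moving its address) are assumptions; the actual logic lives in params.svh and may differ.

```python
from typing import Tuple

BURST_FIXED, BURST_INCR, BURST_WRAP = 0, 1, 2


def legalize(addr: int, length: int, size: int, burst: int,
             bus_bytes: int = 8) -> Tuple[int, int, int, int]:
    """Return (addr, length, size, burst) adjusted to basic AXI4 burst rules.

    length is AxLEN (beats - 1); size is AxSIZE (log2 bytes per beat);
    bus_bytes is the data bus width in bytes and must be a power of two.
    """
    max_size = bus_bytes.bit_length() - 1   # log2 of the bus width in bytes
    size = min(size & 0x7, max_size)        # a beat cannot be wider than the bus
    burst &= 0x3
    if burst == 3:                          # 0b11 is a reserved encoding
        burst = BURST_INCR
    length &= 0xFF
    beat_bytes = 1 << size

    if burst == BURST_WRAP:
        # WRAP: round down to a legal beat count and align the start address.
        beats = max((b for b in (2, 4, 8, 16) if b <= length + 1), default=2)
        length = beats - 1
        addr &= ~(beat_bytes - 1)
    elif burst == BURST_FIXED:
        length = min(length, 15)            # FIXED: at most 16 beats
    else:
        # INCR: shorten the burst so it stays inside its 4 KB page.
        aligned = addr & ~(beat_bytes - 1)
        room = 0x1000 - (aligned & 0xFFF)
        length = min(length, room // beat_bytes - 1)
    return addr, length, size, burst
```

With these choices, a request for 8 beats of 8 bytes starting at 0xFF8 is cut down to a single beat, since the page ends at 0x1000. Repairing a command instead of rejecting it keeps every fuzzer input productive.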
To run our harness:
make fuzz_axi FUZZER=hw_like_sw IP=axi_fifo VENDOR=pulp SIM=verilator
The system automatically spins up the target IP, binds the harness, and begins bombarding the design with fuzzed packed bytes to uncover edge-case behavioral bugs or security violations. By standardizing this, we can easily inject known bugs and quantify precisely which fuzzers identify the faults earliest and with the least overhead.
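One way such a comparison could be tabulated is sketched below in Python. The record shape (fuzzer name, injected bug ID, seconds to first detection or None if the bug was never found) and the function name summarize are assumptions for illustration; the framework's actual metrics and log format are not specified here.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Optional, Tuple

Run = Tuple[str, str, Optional[float]]  # (fuzzer, bug_id, seconds or None)


def summarize(runs: Iterable[Run]) -> Dict[str, Tuple[int, Optional[float]]]:
    """Map each fuzzer to (bugs detected, median seconds to first detection)."""
    times = defaultdict(list)
    fuzzers = set()
    for fuzzer, _bug_id, seconds in runs:
        fuzzers.add(fuzzer)
        if seconds is not None:          # None means the bug was never found
            times[fuzzer].append(seconds)
    return {
        name: (len(times[name]), median(times[name]) if times[name] else None)
        for name in sorted(fuzzers)
    }
```

Reporting the detection count alongside the median time keeps a fuzzer that finds one bug quickly from outranking one that finds many bugs more slowly.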