TL;DR
QuartetFuzz is an autonomous system that generates and verifies fuzz harnesses using a novel Four Principles framework, significantly improving correctness and reducing false positives in fuzz testing across multiple programming languages.
Contribution
It introduces the Four Principles framework for harness correctness and operationalizes it in an LLM-based autonomous system for systematic harness generation and validation.
Findings
Deployed on 23 projects, QuartetFuzz generated 42 bug reports, 29 of which were fixed or confirmed upstream.
Intercepted 58 false-positive crashes during harness generation.
Identified 53 violations in existing harnesses, with 45 confirmed and 35 fixed.
Abstract
Fuzz testing is the dominant technique for finding memory-safety vulnerabilities in C/C++ software, yet its effectiveness hinges on the quality of fuzz harnesses -- the programs that bridge fuzzers and library APIs. A growing body of tools now automate harness generation, but none systematically ensures the correctness of produced harnesses: logic errors, API misuse, and lifecycle violations go undetected at the source level. As LLM-driven generation scales harness creation, uncontrolled quality turns scale into a liability. We present QuartetFuzz, an autonomous harness-generation system that systematically improves correctness throughout the generation process. At its core is the Four Principles framework -- Logic Correctness (P1), API Protocol Compliance (P2), Security Boundary Respect (P3), and Entry Point Adequacy (P4) -- the first source-level definition of harness correctness…
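For readers unfamiliar with the term, a minimal sketch of a fuzz harness in C follows. It is not taken from the paper. The `toy_parse`/`toy_free` API is hypothetical and stands in for a real library. Only `LLVMFuzzerTestOneInput` is a real interface, the standard libFuzzer entry point. The comments tie each step to the abstract's concerns (logic errors, API misuse, lifecycle violations) as an illustrative reading only, since the paper's formal definitions of P1 to P4 are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy library standing in for a real API under test. */
typedef struct {
    uint8_t *buf;
    size_t len;
} toy_doc;

/* Copies the input; returns NULL on empty input or allocation failure. */
static toy_doc *toy_parse(const uint8_t *data, size_t size) {
    if (size == 0) return NULL;
    toy_doc *doc = malloc(sizeof *doc);
    if (!doc) return NULL;
    doc->buf = malloc(size);
    if (!doc->buf) { free(doc); return NULL; }
    memcpy(doc->buf, data, size);
    doc->len = size;
    return doc;
}

/* Releases a document; accepts NULL. */
static void toy_free(toy_doc *doc) {
    if (!doc) return;
    free(doc->buf);
    free(doc);
}

/* libFuzzer entry point: called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    toy_doc *doc = toy_parse(data, size);

    /* Check the documented failure value before use. Dereferencing NULL
     * here would crash in the harness, not in the library, which is the
     * kind of false positive a harness logic error produces. */
    if (doc == NULL) return 0;

    /* Exercise the parsed object within its valid bounds. */
    volatile uint8_t sink = doc->buf[doc->len - 1];
    (void)sink;

    /* Pair every successful parse with a free. Skipping this would be a
     * lifecycle violation that shows up as a leak report. */
    toy_free(doc);
    return 0;
}
```

With Clang, a file like this would typically be built with `-fsanitize=fuzzer,address`, which supplies `main` and drives the entry point with mutated inputs.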
