The Relevance of Classic Fuzz Testing: Have We Solved This One?
Barton P. Miller, Mengxiao Zhang, Elisa R. Heymann

TL;DR
This study revisits classic fuzz testing by applying it to modern Unix utilities on multiple platforms, revealing both persistent and new failure modes, and briefly comparing the results with utilities written in a modern language (Rust).
Contribution
The paper updates and applies basic fuzz testing to current Unix utilities, providing a comparative analysis of failure rates and categories over time and across platforms.
Findings
Failure rates were somewhat higher than in the authors' 1995, 2000, and 2006 studies.
Common failure categories include pointer errors and unchecked return codes.
Limited testing of Rust utilities showed no reliability improvement.
Abstract
As fuzz testing has passed its 30th anniversary, and in the face of the incredible progress in fuzz testing techniques and tools, the question arises whether the classic, basic fuzz technique is still useful and applicable. In that tradition, we have updated the basic fuzz tools and testing scripts and applied them to a large collection of Unix utilities on Linux, FreeBSD, and MacOS. As before, our failure criterion was whether the program crashed or hung. We found that 9 out of 74 utilities crashed or hung on Linux, 15 out of 78 on FreeBSD, and 12 out of 76 on MacOS. A total of 24 different utilities failed across the three platforms. We note that these failure rates are somewhat higher than in our previous 1995, 2000, and 2006 studies of the reliability of command line utilities. In the basic fuzz tradition, we debugged each failed utility and categorized the causes the…
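
The basic technique the abstract describes (feed random input to a utility and record whether it crashes or hangs) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the function names, the 4096-byte input cap, and the 5-second hang threshold are all assumptions chosen for illustration, and crash detection via a negative return code relies on POSIX behavior.

```python
import random
import subprocess


def random_input(rng, max_len=4096):
    """Return a random byte string of random length (0..max_len).

    max_len is an illustrative assumption, not a value from the paper.
    """
    n = rng.randint(0, max_len)
    return bytes(rng.randrange(256) for _ in range(n))


def run_once(cmd, data, timeout=5.0):
    """Feed `data` to `cmd` on stdin and classify the outcome.

    Returns "hang" if the process exceeds `timeout` seconds, "crash" if it
    was terminated by a signal, and "ok" otherwise. A non-zero exit status
    alone is not counted as a failure, matching the crash-or-hang criterion.
    """
    try:
        proc = subprocess.run(
            cmd,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "hang"
    # On POSIX, a negative returncode means "killed by signal -returncode"
    # (for example -11 for SIGSEGV). This holds only when no shell wraps cmd.
    return "crash" if proc.returncode < 0 else "ok"


def fuzz(cmd, trials=20, seed=0, timeout=5.0):
    """Run `trials` random inputs against `cmd` and tally the outcomes."""
    rng = random.Random(seed)  # fixed seed so a failing run can be replayed
    counts = {"ok": 0, "crash": 0, "hang": 0}
    for _ in range(trials):
        counts[run_once(cmd, random_input(rng), timeout)] += 1
    return counts
```

Calling `fuzz(["sort"], trials=50)`, for example, would return a dictionary of outcome counts for one utility. Under the crash-or-hang criterion, the counts reported in the abstract correspond to failure rates of roughly 12% on Linux (9/74), 19% on FreeBSD (15/78), and 16% on MacOS (12/76).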
