My Fuzzer Beats Them All! Developing a Framework for Fair Evaluation and Comparison of Fuzzers
David Paaßen, Sebastian Surminski, Michael Rodler, Lucas Davi

TL;DR
This paper introduces SENF, a comprehensive statistical framework for fair and rigorous evaluation of fuzzers, addressing inconsistencies and biases in current fuzzing assessments.
Contribution
The paper presents SENF, a novel evaluation framework that incorporates statistical techniques for fair comparison of fuzzers, and demonstrates its effectiveness through empirical analysis.
Findings
Small parameter changes significantly affect fuzzer performance metrics
Evaluation parameters like repetitions and runtime influence results substantially
Using SENF leads to more reliable and consistent fuzzer comparisons
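To make the findings above concrete, here is a minimal sketch of how repeated trials of two fuzzers might be compared statistically. It is not SENF's implementation, and this summary does not state which tests SENF uses. The choice of the Vargha-Delaney A12 effect size and a permutation test on medians is an assumption made for illustration, and the names `a12`, `permutation_p_value`, `fuzzer_a`, and `fuzzer_b` are hypothetical. The bug counts are made-up numbers, not results from the paper.

```python
import random
from statistics import median


def a12(xs, ys):
    """Vargha-Delaney A12: chance that a random run from xs beats one from ys.

    Ties count as half a win, so 0.5 means no difference between the groups.
    """
    wins = sum((x > y) + 0.5 * (x == y) for x in xs for y in ys)
    return wins / (len(xs) * len(ys))


def permutation_p_value(xs, ys, rounds=10_000, seed=0):
    """Two-sided permutation test on the absolute difference of medians."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(median(xs) - median(ys))
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(rounds):
        # Randomly reassign runs to the two groups and recompute the gap.
        rng.shuffle(pooled)
        diff = abs(median(pooled[:len(xs)]) - median(pooled[len(xs):]))
        hits += diff >= observed
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (rounds + 1)


# Hypothetical bug counts from 10 repeated trials per fuzzer (made-up data).
fuzzer_a = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
fuzzer_b = [10, 11, 9, 12, 10, 13, 11, 10, 12, 11]

effect = a12(fuzzer_a, fuzzer_b)
p_value = permutation_p_value(fuzzer_a, fuzzer_b)
print(f"A12 = {effect:.2f}, p = {p_value:.4f}")
```

Rerunning the sketch with fewer trials per fuzzer, or with counts taken at a shorter runtime, is one way to see how sensitive such a comparison is to the evaluation parameters.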
Abstract
Fuzzing has become one of the most popular techniques to identify bugs in software. To improve the fuzzing process, a plethora of techniques have recently appeared in academic literature. However, evaluating and comparing these techniques is challenging as fuzzers depend on randomness when generating test inputs. Commonly, existing evaluations only partially follow best practices for fuzzing evaluations. We argue that the reasons for this are twofold. First, it is unclear if the proposed guidelines are necessary due to the lack of comprehensive empirical data in the case of fuzz testing. Second, there does not yet exist a framework that integrates statistical evaluation techniques to enable fair comparison of fuzzers. To address these limitations, we introduce a novel fuzzing evaluation framework called SENF (Statistical EvaluatioN of Fuzzers). We demonstrate the practical applicability…
