Comment on Revisiting Neural Program Smoothing for Fuzzing
Dongdong She, Kexin Pei, Junfeng Yang, Baishakhi Ray, Suman Jana

TL;DR
This paper re-examines MLFuzz's evaluation of NEUZZ, a machine learning-based fuzzer. It identifies implementation bugs and evaluation flaws in MLFuzz and shows that NEUZZ performs well when evaluated correctly.
Contribution
The authors correct MLFuzz's flawed evaluation of NEUZZ, provide a corrected implementation and evaluation setup, and show that NEUZZ performs well compared with AFL on FuzzBench.
Findings
NEUZZ outperforms AFL when properly evaluated
Implementation bugs in MLFuzz led to incorrect conclusions
Proper data cleaning improves NEUZZ's performance
Abstract
MLFuzz, a work accepted at ACM FSE 2023, revisits the performance of a machine learning-based fuzzer, NEUZZ. We demonstrate that its main conclusion is entirely wrong due to several fatal bugs in the implementation and wrong evaluation setups, including an initialization bug in persistent mode, a program crash, an error in training dataset collection, and a mistake in fuzzing result collection. Additionally, MLFuzz uses noisy training datasets without sufficient data cleaning and preprocessing, which contributes to a drastic performance drop in NEUZZ. We address these issues and provide a corrected implementation and evaluation setup, showing that NEUZZ consistently performs well over AFL on the FuzzBench dataset. Finally, we reflect on the evaluation methods used in MLFuzz and offer practical advice on fair and scientific fuzzing evaluations.
Taxonomy
Topics: Neural Networks and Applications · Advanced Vision and Imaging
