Not all bytes are equal: Neural byte sieve for fuzzing
Mohit Rajpal, William Blum, Rishabh Singh

TL;DR
This paper introduces a neural network-guided fuzzing technique that learns patterns from past fuzzing data to improve the efficiency of generating malicious inputs, significantly enhancing code coverage and bug discovery.
Contribution
It presents a novel neural network-based approach integrated with AFL to guide fuzzing mutations, improving effectiveness over traditional uniform random methods.
Findings
Increased code coverage across multiple input file formats.
Higher number of unique code paths discovered.
More crashes found compared to baseline fuzzing.
Abstract
Fuzzing is a popular dynamic program analysis technique used to find vulnerabilities in complex software. Fuzzing involves presenting a target program with crafted malicious input designed to cause crashes, buffer overflows, memory errors, and exceptions. Crafting malicious inputs in an efficient manner is a difficult open problem and often the best approach to generating such inputs is through applying uniform random mutations to pre-existing valid inputs (seed files). We present a learning technique that uses neural networks to learn patterns in the input files from past fuzzing explorations to guide future fuzzing explorations. In particular, the neural models learn a function to predict good (and bad) locations in input files to perform fuzzing mutations based on the past mutations and corresponding code coverage information. We implement several neural models including LSTMs and…
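Example
The core idea in the abstract, steering mutations toward byte positions that a model rates as promising instead of choosing positions uniformly at random, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-byte `scores` list stands in for a trained model's output, the function names `pick_offsets` and `mutate` are hypothetical, and the mutation (overwriting a byte with a random value) is far simpler than the mutation operators in a real fuzzer such as AFL.

```python
import random


def pick_offsets(scores, k, rng):
    """Sample k byte offsets, favouring offsets with higher scores.

    `scores` is a hypothetical per-byte usefulness estimate (one
    non-negative number per input byte), standing in for a model's output.
    """
    if not scores:
        return []
    if sum(scores) <= 0:
        # No usable signal: fall back to the uniform random baseline.
        return [rng.randrange(len(scores)) for _ in range(k)]
    # Weighted sampling with replacement; zero-score offsets are never chosen.
    return rng.choices(range(len(scores)), weights=scores, k=k)


def mutate(data, scores, k, rng):
    """Return a copy of `data` with k score-weighted offsets overwritten."""
    out = bytearray(data)
    for offset in pick_offsets(scores, k, rng):
        out[offset] = rng.randrange(256)  # simplistic stand-in mutation
    return bytes(out)


seed = b"HDR0payload-bytes"
# Assumed scores: pretend the model rates only the 4-byte header as useful.
scores = [1.0] * 4 + [0.0] * (len(seed) - 4)
mutated = mutate(seed, scores, k=3, rng=random.Random(0))
```

With these assumed scores, every mutation lands in the first four bytes and the rest of the seed file is left untouched; with all-zero scores the sketch degrades to uniform random selection.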
Taxonomy
Topics: Software Testing and Debugging Techniques · Advanced Malware Detection Techniques · Software Engineering Research
