Learning Inputs in Greybox Fuzzing
Valentin Wüstholz, Maria Christakis

TL;DR
This paper introduces a learning-based input generation method for greybox fuzzing that significantly increases path coverage and detects more bugs, often much faster, on 26 real-world benchmarks.
Contribution
It extends greybox fuzzing with a technique for learning inputs from explored executions to better target complex code paths.
Findings
Up to 3X increase in path coverage
Up to 38% more bugs detected
Often orders-of-magnitude faster than traditional greybox fuzzing
Abstract
Greybox fuzzing is a lightweight testing approach that effectively detects bugs and security vulnerabilities. However, greybox fuzzers randomly mutate program inputs to exercise new paths; this makes it challenging to cover code that is guarded by complex checks. In this paper, we present a technique that extends greybox fuzzing with a method for learning new inputs based on already explored program executions. These inputs can be learned such that they guide exploration toward specific executions, for instance, ones that increase path coverage or reveal vulnerabilities. We have evaluated our technique and compared it to traditional greybox fuzzing on 26 real-world benchmarks. In comparison, our technique significantly increases path coverage (by up to 3X) and detects more bugs (up to 38% more), often orders-of-magnitude faster.
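The contrast drawn in the abstract, between blind mutation and inputs learned from already explored executions, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy program `run`, the guard constant `MAGIC`, the mutation operators, and the use of a secant step on a signed branch distance as the learning rule. The abstract does not say how inputs are learned, so this should not be read as the authors' implementation.

```python
import random
from typing import Optional, Tuple

MAGIC = 123_456  # hypothetical guard constant, chosen only for illustration


def run(x: int) -> Tuple[str, int]:
    """Toy program under test: returns (path id, signed distance to the guard)."""
    distance = 2 * x + 6 - MAGIC  # zero exactly when the guarded branch is taken
    return ("guarded" if distance == 0 else "default", distance)


def mutate(x: int, rng: random.Random) -> int:
    """Blind mutation: flip one random bit or add a small random delta."""
    if rng.random() < 0.5:
        return x ^ (1 << rng.randrange(32))
    return x + rng.randint(-16, 16)


def fuzz(seed: int, budget: int, learn: bool, rng: random.Random) -> Optional[int]:
    """Greybox loop: keep inputs that reach new paths; optionally learn inputs.

    Returns an input that reaches the guarded branch, or None if the budget
    runs out first.
    """
    queue, seen = [seed], {run(seed)[0]}
    for _ in range(budget):
        parent = rng.choice(queue)
        child = mutate(parent, rng)
        candidates = [child]
        if learn:
            # Assumed learning rule: a secant step through the two observed
            # (input, distance) pairs, aiming for distance zero.
            (_, d0), (_, d1) = run(parent), run(child)
            if d1 != d0:
                candidates.append(child - d1 * (child - parent) // (d1 - d0))
        for c in candidates:
            path, _ = run(c)
            if path == "guarded":
                return c
            if path not in seen:
                seen.add(path)
                queue.append(c)
    return None


if __name__ == "__main__":
    print("blind  :", fuzz(0, 1000, learn=False, rng=random.Random(0)))
    print("learned:", fuzz(0, 100, learn=True, rng=random.Random(0)))
```

In this toy setting the guard depends linearly on the input, so a single secant step lands on the satisfying value, while single-step mutations of the seed `0` (powers of two or small offsets) can never reach it. Real guards are rarely this well behaved; the sketch only shows why information from earlier executions can help where random mutation stalls.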
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Reliability and Analysis Research · Advanced Malware Detection Techniques
