Synthesizing Program Input Grammars
Osbert Bastani, Rahul Sharma, Alex Aiken, Percy Liang

TL;DR
This paper introduces GLADE, which synthesizes a context-free grammar for a program's valid inputs from a set of input examples and blackbox access to the program, then uses that grammar to improve fuzz-testing coverage on programs with structured inputs.
Contribution
The paper presents a grammar synthesis algorithm that addresses two shortcomings of existing grammar inference algorithms, severe overgeneralization and prohibitive slowness, and applies the synthesized grammar to fuzz testing.
Findings
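GLADE substantially increases incremental coverage on valid inputs compared to two baseline fuzzers.
The algorithm avoids the severe overgeneralization and prohibitive running time attributed to existing grammar inference algorithms.
Fuzzing programs with structured inputs becomes more effective when guided by the synthesized grammar.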
Abstract
We present an algorithm for synthesizing a context-free grammar encoding the language of valid program inputs from a set of input examples and blackbox access to the program. Our algorithm addresses shortcomings of existing grammar inference algorithms, which both severely overgeneralize and are prohibitively slow. Our implementation, GLADE, leverages the grammar synthesized by our algorithm to fuzz test programs with structured inputs. We show that GLADE substantially increases the incremental coverage on valid inputs compared to two baseline fuzzers.
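The setting in the abstract, input examples plus a blackbox validity check, can be illustrated with a minimal sketch. It is not the GLADE algorithm: the `oracle` function (a balanced-parentheses checker standing in for the program under test), the helper names, and the `max_reps` bound are all assumptions made for illustration. The sketch proposes turning a substring of a seed input into a repetition and keeps the proposal only if the oracle accepts a few sampled strings, which is a heuristic check rather than a proof.

```python
from typing import Callable, Iterator, List, Tuple

Split = Tuple[str, str, str]  # (prefix, body, suffix)


def oracle(s: str) -> bool:
    """Hypothetical stand-in for the program under test.

    Accepts exactly the strings of balanced parentheses.
    """
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        else:
            return False
    return depth == 0


def repetition_is_safe(prefix: str, body: str, suffix: str,
                       accepts: Callable[[str], bool],
                       max_reps: int = 3) -> bool:
    """Query the oracle on prefix + body*k + suffix for k = 0..max_reps.

    Passing every check only suggests that prefix (body)* suffix is valid.
    """
    return all(accepts(prefix + body * k + suffix)
               for k in range(max_reps + 1))


def candidate_splits(seed: str) -> Iterator[Split]:
    """Enumerate every way to pick a non-empty substring of the seed."""
    for i in range(len(seed)):
        for j in range(i + 1, len(seed) + 1):
            yield seed[:i], seed[i:j], seed[j:]


def generalize(seed: str, accepts: Callable[[str], bool]) -> List[Split]:
    """Keep only the splits whose repetition passes the oracle checks."""
    return [split for split in candidate_splits(seed)
            if repetition_is_safe(*split, accepts)]


if __name__ == "__main__":
    for prefix, body, suffix in generalize("(())", oracle):
        print(f"{prefix!r} ({body!r})* {suffix!r}")
```

Rejecting candidates that the oracle refuses is one simple way to limit overgeneralization. Every candidate costs several program executions, so the number of oracle queries is the main cost to watch in a sketch like this.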
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Parallel Computing and Optimization Techniques
