Synthesizing Program Input Grammars

Osbert Bastani; Rahul Sharma; Alex Aiken; Percy Liang

arXiv:1608.01723 · cs.PL · June 19, 2017

TL;DR

This paper introduces GLADE, which synthesizes a context-free grammar for a program's valid inputs from a set of input examples and blackbox access to the program, then uses that grammar to fuzz test programs with structured inputs.

Contribution

The paper presents a grammar synthesis algorithm that addresses two shortcomings of existing grammar inference algorithms, severe overgeneralization and prohibitive running time, and applies the synthesized grammar to fuzz testing.

Findings

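01 GLADE substantially increases incremental coverage on valid inputs compared to two baseline fuzzers.

02 The algorithm addresses the overgeneralization and slowness of existing grammar inference algorithms.

03 Using the synthesized grammar makes fuzz testing of programs with structured inputs more effective.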

Abstract

We present an algorithm for synthesizing a context-free grammar encoding the language of valid program inputs from a set of input examples and blackbox access to the program. Our algorithm addresses shortcomings of existing grammar inference algorithms, which both severely overgeneralize and are prohibitively slow. Our implementation, GLADE, leverages the grammar synthesized by our algorithm to fuzz test programs with structured inputs. We show that GLADE substantially increases the incremental coverage on valid inputs compared to two baseline fuzzers.
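The abstract's setting (input examples plus blackbox access to the program) can be illustrated with a small sketch. This is not the paper's algorithm: it is a simplified, assumed generalization check in which a substring of a seed input is treated as repeatable if the program still accepts the input when that substring is dropped or doubled. The `oracle` function is a hypothetical stand-in for the program under test (it accepts balanced parentheses), and `repeatable_substrings` is an illustrative name, not part of GLADE.

```python
from typing import Callable, List, Tuple


def oracle(s: str) -> bool:
    """Hypothetical stand-in for the program under test.

    Accepts exactly the strings of balanced parentheses.
    """
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        else:
            return False
    return depth == 0


def repeatable_substrings(
    seed: str, accepts: Callable[[str], bool]
) -> List[Tuple[int, int]]:
    """Return spans (i, j) such that seed[i:j] can be removed or doubled
    and the resulting input is still accepted by the blackbox.

    Assumption: two check strings per candidate are enough for this
    illustration; a real synthesizer would need more careful checks.
    """
    spans = []
    for i in range(len(seed)):
        for j in range(i + 1, len(seed) + 1):
            prefix, middle, suffix = seed[:i], seed[i:j], seed[j:]
            checks = [prefix + suffix, prefix + middle * 2 + suffix]
            if all(accepts(c) for c in checks):
                spans.append((i, j))
    return spans


if __name__ == "__main__":
    print(repeatable_substrings("(())", oracle))
```

Each returned span is a candidate for a repetition construct in a grammar for the input language. The only information used about the program is its accept/reject answer on the check strings, which is what blackbox access means here.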

Code & Models

Repositories

obastani/glade (official)

Taxonomy

Topics: Software Testing and Debugging Techniques · Software Engineering Research · Parallel Computing and Optimization Techniques