
TL;DR
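This paper introduces two parsing methods for context-free languages: one extends Brzozowski's derivative from regular expressions to context-free grammars, and the other generalizes the derivative to parser combinators. The result is a parsing library of fewer than 250 lines that preliminary experiments suggest is efficient enough for practical use.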
Contribution
It presents two derivative-based parsing techniques for arbitrary context-free grammars, yielding a small, easy-to-implement library that produces lazy parse forests, with implementations in Scala and Haskell.
Findings
Preliminary experiments with S-Expressions parsed millions of tokens per second
The parsing library is fewer than 250 lines of code
The preliminary results suggest the technique is efficient enough for use in practice
Abstract
We present two novel approaches to parsing context-free languages. The first approach is based on an extension of Brzozowski's derivative from regular expressions to context-free grammars. The second approach is based on a generalization of the derivative to parser combinators. The payoff of these techniques is a small (less than 250 lines of code), easy-to-implement parsing library capable of parsing arbitrary context-free grammars into lazy parse forests. Implementations for both Scala and Haskell are provided. Preliminary experiments with S-Expressions parsed millions of tokens per second, which suggests this technique is efficient enough for use in practice.
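To make the starting point of the first approach concrete, here is a minimal sketch of Brzozowski's derivative for plain regular expressions, which is the construction the abstract says is extended to context-free grammars. It is not the paper's library and does not attempt the extension to recursive grammars or parse forests. The class and function names (`Regex`, `nullable`, `derive`, `matches`) are illustrative assumptions, and the sketch is written in Python rather than the Scala or Haskell of the provided implementations.

```python
from dataclasses import dataclass


class Regex:
    """Base class for regular-expression nodes (hypothetical names)."""


@dataclass(frozen=True)
class Empty(Regex):
    """Matches no string at all."""


@dataclass(frozen=True)
class Eps(Regex):
    """Matches only the empty string."""


@dataclass(frozen=True)
class Char(Regex):
    c: str


@dataclass(frozen=True)
class Alt(Regex):
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Seq(Regex):
    first: Regex
    second: Regex


@dataclass(frozen=True)
class Star(Regex):
    inner: Regex


def nullable(r: Regex) -> bool:
    """Return True if r accepts the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Seq):
        return nullable(r.first) and nullable(r.second)
    return False  # Empty and Char never accept the empty string


def derive(r: Regex, c: str) -> Regex:
    """Derivative of r with respect to c: the language of suffixes that
    may follow c in strings accepted by r."""
    if isinstance(r, Char):
        return Eps() if r.c == c else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.left, c), derive(r.right, c))
    if isinstance(r, Seq):
        head = Seq(derive(r.first, c), r.second)
        # If the first part can be skipped, c may also start the second part.
        if nullable(r.first):
            return Alt(head, derive(r.second, c))
        return head
    if isinstance(r, Star):
        return Seq(derive(r.inner, c), r)
    return Empty()  # Empty and Eps have no strings starting with c


def matches(r: Regex, s: str) -> bool:
    """Derive by each character in turn, then test for nullability."""
    for c in s:
        r = derive(r, c)
    return nullable(r)
```

For example, with `ab_star = Star(Seq(Char("a"), Char("b")))`, the call `matches(ab_star, "abab")` returns `True` and `matches(ab_star, "aba")` returns `False`. The sketch applies no simplification, so derived expressions grow with the input; a practical matcher would normally compact them.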
Taxonomy
Topics: Logic, programming, and type systems · Natural Language Processing Techniques · Advanced Database Systems and Queries
