Packrat Parsing: Simple, Powerful, Lazy, Linear Time

Bryan Ford

arXiv:cs/0603077 · cs.DS · May 23, 2007 · 84 cites

TL;DR

Packrat parsing is a technique for implementing parsers in a lazy functional programming language. It offers the flexibility of top-down parsing with backtracking and unlimited lookahead while still guaranteeing linear parse time.

Contribution

This paper introduces packrat parsing, an approach that achieves linear-time parsing with backtracking and unlimited lookahead and recognizes many languages that conventional linear-time algorithms do not support.

Findings

01 Recognizes any language defined by an LL(k) or LR(k) grammar

02 Simplifies the handling of common syntactic idioms such as the longest-match rule

03 Allows lexical analysis to be integrated into parsing

Abstract

Packrat parsing is a novel technique for implementing parsers in a lazy functional programming language. A packrat parser provides the power and flexibility of top-down parsing with backtracking and unlimited lookahead, but nevertheless guarantees linear parse time. Any language defined by an LL(k) or LR(k) grammar can be recognized by a packrat parser, in addition to many languages that conventional linear-time algorithms do not support. This additional power simplifies the handling of common syntactic idioms such as the widespread but troublesome longest-match rule, enables the use of sophisticated disambiguation strategies such as syntactic and semantic predicates, provides better grammar composition properties, and allows lexical analysis to be integrated seamlessly into parsing. Yet despite its power, packrat parsing shares the same simplicity and elegance as recursive descent…
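
The core idea described above, backtracking recursive descent whose results are remembered per input position, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class name `PackratParser`, the helper `parse`, and the small arithmetic grammar given in the comments. Python is a strict language, so the sketch uses an explicit memo table where a lazy functional implementation would rely on lazy evaluation.

```python
from typing import Callable, Dict, Optional, Tuple

# A parse result is (value, next position), or None if the rule fails.
Result = Optional[Tuple[int, int]]


class PackratParser:
    """Illustrative packrat-style parser for a hypothetical grammar:

    Additive  <- Multitive '+' Additive / Multitive
    Multitive <- Primary '*' Multitive / Primary
    Primary   <- '(' Additive ')' / Decimal
    Decimal   <- [0-9]
    """

    def __init__(self, text: str) -> None:
        self.text = text
        # One entry per (rule name, input position): the "packrat" table.
        self.memo: Dict[Tuple[str, int], Result] = {}

    def _memoized(self, name: str, pos: int,
                  body: Callable[[int], Result]) -> Result:
        # Evaluate each rule at most once per position; later calls reuse it.
        key = (name, pos)
        if key not in self.memo:
            self.memo[key] = body(pos)
        return self.memo[key]

    def _char(self, pos: int, ch: str) -> bool:
        return pos < len(self.text) and self.text[pos] == ch

    def additive(self, pos: int) -> Result:
        return self._memoized("additive", pos, self._additive)

    def multitive(self, pos: int) -> Result:
        return self._memoized("multitive", pos, self._multitive)

    def primary(self, pos: int) -> Result:
        return self._memoized("primary", pos, self._primary)

    def decimal(self, pos: int) -> Result:
        return self._memoized("decimal", pos, self._decimal)

    def _additive(self, pos: int) -> Result:
        left = self.multitive(pos)
        if left is not None and self._char(left[1], "+"):
            right = self.additive(left[1] + 1)
            if right is not None:
                return left[0] + right[0], right[1]
        # Backtrack to the second alternative; this call is a memo hit.
        return self.multitive(pos)

    def _multitive(self, pos: int) -> Result:
        left = self.primary(pos)
        if left is not None and self._char(left[1], "*"):
            right = self.multitive(left[1] + 1)
            if right is not None:
                return left[0] * right[0], right[1]
        # Backtrack to the second alternative; this call is a memo hit.
        return self.primary(pos)

    def _primary(self, pos: int) -> Result:
        if self._char(pos, "("):
            inner = self.additive(pos + 1)
            if inner is not None and self._char(inner[1], ")"):
                return inner[0], inner[1] + 1
        return self.decimal(pos)

    def _decimal(self, pos: int) -> Result:
        if pos < len(self.text) and self.text[pos] in "0123456789":
            return int(self.text[pos]), pos + 1
        return None


def parse(text: str) -> Optional[int]:
    """Return the value of the whole input, or None if it does not parse."""
    result = PackratParser(text).additive(0)
    if result is None or result[1] != len(text):
        return None
    return result[0]
```

For example, `parse("2*(3+4)")` evaluates to `14` and `parse("2+")` returns `None`. Because the memo table holds at most one entry per rule and input position, the number of rule evaluations in this sketch grows linearly with the input length even though alternatives are retried after a failure.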

Taxonomy

Topics: Natural Language Processing Techniques · Logic, programming, and type systems · Semigroups and automata theory