Practical LR Parser Generation

Joe Zimmerman

arXiv:2209.08383·cs.FL·September 20, 2022

TL;DR

This paper introduces a novel approach to automatically generate efficient LR parsers for a broad class of programming languages, overcoming traditional limitations and demonstrating practical performance improvements over hand-written parsers.

Contribution

It presents new algorithms and extensions, including automata optimization, grammar transformation, and the XLR extension, enabling automatic parser generation for a wide range of languages.

Findings

01. Generated parsers are 1.2x faster than the hand-written Go parser.

02. Generated parsers are 4.3x faster than the CPython parser.

03. The approach supports a broad class of practical grammars.

Abstract

Parsing is a fundamental building block in modern compilers, and for industrial programming languages, it is a surprisingly involved task. There are known approaches to generate parsers automatically, but the prevailing consensus is that automatic parser generation is not practical for real programming languages: LR/LALR parsers are considered to be far too restrictive in the grammars they support, and LR parsers are often considered too inefficient in practice. As a result, virtually all modern languages use recursive-descent parsers written by hand, a lengthy and error-prone process that dramatically increases the barrier to new programming language development. In this work we demonstrate that, contrary to the prevailing consensus, we can have the best of both worlds: for a very general, practical class of grammars -- a strict superset of Knuth's canonical LR -- we can generate…

Code & Models

Repositories

jzimmerman/langcc

Taxonomy

Topics: Logic, programming, and type systems · Software Testing and Debugging Techniques · Formal Methods in Verification