langcc: A Next-Generation Compiler Compiler
Joe Zimmerman

TL;DR
langcc is a compiler generator that automatically produces efficient parsers for full industrial programming languages, a task still mostly done by hand. Its generated parsers for Golang 1.17.8 and Python 3.9.12 are, respectively, 1.2x and 4.3x faster than the standard parsers.
Contribution
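This companion technical report describes the software implementation of an approach to automatic parser generation [Zim22] that builds on the standard LR paradigm of Knuth et al. with a number of innovations, making generated parsers practical for full industrial languages.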
Findings
Generated parsers for Golang 1.17.8 and Python 3.9.12 are, respectively, 1.2x and 4.3x faster than the standard parsers.
The methodology can automatically generate efficient parsers for virtually all languages that are intuitively "easy to parse".
The software implementation is open-source and publicly available.
Abstract
Traditionally, parsing has been a laborious and error-prone component of compiler development, and most parsers for full industrial programming languages are still written by hand. The author [Zim22] shows that automatic parser generation can be practical, via a number of innovations upon the standard LR paradigm of Knuth et al. With this methodology, we can automatically generate efficient parsers for virtually all languages that are intuitively "easy to parse". This includes Golang 1.17.8 and Python 3.9.12, for which our generated parsers are, respectively, 1.2x and 4.3x faster than the standard parsers. This document is a companion technical report that describes the software implementation of that work, available open-source at https://github.com/jzimmerman/langcc.
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Software Engineering Research · Computational Physics and Python Applications
