Don't Panic! Better, Fewer, Syntax Errors for LR Parsers
Lukas Diekmann, Laurence Tratt

TL;DR
This paper introduces CPCT+, an error recovery algorithm for LR parsers that reports the complete set of minimum cost repair sequences at an error location and produces fewer cascading errors than panic mode on a corpus of real-world, syntactically invalid Java programs.
Contribution
The paper presents the CPCT+ algorithm, and an implementation of it, for LR parser error recovery. CPCT+ reports all minimum cost repair sequences for an error location and reduces cascading errors relative to panic mode.
Findings
Repairs 98.37% of a corpus of 200,000 real-world, syntactically invalid Java programs within 0.5s
Reports fewer cascading errors than panic mode
Reports the complete set of minimum cost repair sequences for a given error location
Abstract
Syntax errors are generally easy to fix for humans, but not for parsers in general nor LR parsers in particular. Traditional 'panic mode' error recovery, though easy to implement and applicable to any grammar, often leads to a cascading chain of errors that drown out the original. More advanced error recovery techniques suffer less from this problem but have seen little practical use because their typical performance was seen as poor, their worst case unbounded, and the repairs they reported arbitrary. In this paper we introduce the CPCT+ algorithm, and an implementation of that algorithm, that address these issues. First, CPCT+ reports the complete set of minimum cost repair sequences for a given location, allowing programmers to select the one that best fits their intention. Second, on a corpus of 200,000 real-world syntactically invalid Java programs, CPCT+ is able to repair 98.37%…
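To make "the complete set of minimum cost repair sequences" concrete, here is a minimal Python sketch. It is not CPCT+ itself: a hand-written prefix checker for a toy expression grammar stands in for a real LR parser, and the names (`viable`, `first_error`, `min_cost_repairs`), the unit cost per insert or delete, and the 3-shift success threshold are all assumptions made for illustration. The search proceeds one cost level at a time and returns every repair sequence found at the first level where any succeeds.

```python
from collections import deque

TERMINALS = ["n", "+", "(", ")"]   # toy alphabet (assumed); "$" marks end of input
N_SHIFTS = 3                        # assumed: shifts needed to call a repair successful


def viable(prefix):
    """True if `prefix` can be extended to a sentence of the toy grammar
    E -> E '+' T | T ;  T -> 'n' | '(' E ')'   (followed by '$').
    A hand-written stand-in for asking an LR parser whether it can shift."""
    depth, want_operand = 0, True
    for i, tok in enumerate(prefix):
        if tok in ("n", "(") and want_operand:
            depth += tok == "("
            want_operand = tok == "("
        elif tok == "+" and not want_operand:
            want_operand = True
        elif tok == ")" and not want_operand and depth > 0:
            depth -= 1
        elif tok == "$" and not want_operand and depth == 0:
            return i == len(prefix) - 1
        else:
            return False
    return True


def first_error(tokens):
    """Index of the first token that cannot be shifted, or None if input is valid."""
    for i in range(len(tokens)):
        if not viable(tokens[: i + 1]):
            return i
    return None


def min_cost_repairs(tokens, max_cost=3):
    """All repair sequences of minimum cost at the first error location.
    Insert and Delete cost 1 each; Shift is free."""
    err = first_error(tokens)
    if err is None:
        return []
    level = deque([(tokens[:err], err, ())])   # (accepted prefix, input pos, repairs)
    for _cost in range(max_cost + 1):
        found, next_level = set(), []
        while level:
            prefix, pos, repairs = level.popleft()
            shifts = 0                          # number of trailing Shift repairs
            while shifts < len(repairs) and repairs[-1 - shifts].startswith("Shift"):
                shifts += 1
            if pos == len(tokens) or (repairs and shifts >= N_SHIFTS):
                found.add(repairs[: len(repairs) - shifts])  # drop trailing shifts
                continue
            tok = tokens[pos]
            if viable(prefix + [tok]):          # free edge: stays in this cost level
                level.append((prefix + [tok], pos + 1, repairs + (f"Shift {tok}",)))
            for t in TERMINALS:                 # cost 1: insert a token
                if viable(prefix + [t]):
                    next_level.append((prefix + [t], pos, repairs + (f"Insert {t}",)))
            if tok != "$":                      # cost 1: delete the next input token
                next_level.append((prefix, pos + 1, repairs + (f"Delete {tok}",)))
        if found:
            return sorted(found)
        level = deque(next_level)
    return []


print(min_cost_repairs(["n", "n", "$"]))
```

For the input `n n`, the sketch reports two equally cheap repairs, deleting the second `n` or inserting `+` between the two, rather than picking one arbitrarily. A real implementation would search over LR parser stacks and would need the pruning and time bound that the abstract alludes to; this brute-force version only illustrates what is being reported.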
