Formally Verified Linear-Time Invertible Lexing
Samuel Chassot, Viktor Kun\v{c}ak

TL;DR
ZipLex is a verified, invertible, linear-time lexical analysis framework that guarantees mutual invertibility of lexing and printing, supporting realistic applications with high performance.
Contribution
It introduces ZipLex, a novel verified framework that ensures invertibility and linear-time performance in lexical analysis, surpassing previous approaches in efficiency and correctness.
Findings
ZipLex guarantees mutual invertibility of lexing and printing.
ZipLex achieves linear-time lexing for realistic applications.
ZipLex outperforms Verbatim++ by two orders of magnitude in speed.
Abstract
We present ZipLex, a verified framework for invertible linear-time lexical analysis following the longest match (maximal munch) semantics. Unlike past verified lexers that focus only on satisfying the semantics of regular expressions and the longest match property, ZipLex also guarantees that lexing and printing are mutual inverses. Thanks to verified memoization, it also ensures that the lexical analysis of a string is linear in the size of the string. Our design and implementation rely on two sets of ideas: (1) a new abstraction of token sequences that captures the separability of tokens in a sequence while supporting their efficient manipulation, and (2) a combination of verified data structures and optimizations, including Huet's zippers and memoization with a standalone verified imperative hash table. Our hash table offers competitive performance as shown by our evaluation. We…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, programming, and type systems · Security and Verification in Computing · Parallel Computing and Optimization Techniques
