Formally Verified Linear-Time Invertible Lexing

Samuel Chassot; Viktor Kun\v{c}ak

arXiv:2510.18479·cs.PL·May 22, 2026

Formally Verified Linear-Time Invertible Lexing

Samuel Chassot, Viktor Kun\v{c}ak

PDF

TL;DR

ZipLex is a verified, invertible, linear-time lexical analysis framework that guarantees mutual invertibility of lexing and printing, supporting realistic applications with high performance.

Contribution

It introduces ZipLex, a novel verified framework that ensures invertibility and linear-time performance in lexical analysis, surpassing previous approaches in efficiency and correctness.

Findings

01

ZipLex guarantees mutual invertibility of lexing and printing.

02

ZipLex achieves linear-time lexing for realistic applications.

03

ZipLex outperforms Verbatim++ by two orders of magnitude in speed.

Abstract

We present ZipLex, a verified framework for invertible linear-time lexical analysis following the longest match (maximal munch) semantics. Unlike past verified lexers that focus only on satisfying the semantics of regular expressions and the longest match property, ZipLex also guarantees that lexing and printing are mutual inverses. Thanks to verified memoization, it also ensures that the lexical analysis of a string is linear in the size of the string. Our design and implementation rely on two sets of ideas: (1) a new abstraction of token sequences that captures the separability of tokens in a sequence while supporting their efficient manipulation, and (2) a combination of verified data structures and optimizations, including Huet's zippers and memoization with a standalone verified imperative hash table. Our hash table offers competitive performance as shown by our evaluation. We…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsLogic, programming, and type systems · Security and Verification in Computing · Parallel Computing and Optimization Techniques