Formal Verification for JavaScript Regular Expressions: a Proven Semantics and its Applications (Extended Version)
Aur\`ele Barri\`ere, Victor Deng, Cl\'ement Pit-Claudel

TL;DR
This paper introduces a complete, mechanized semantics for JavaScript regular expressions with backtracking, enabling formal verification and practical applications like equivalence checking and algorithm proofs.
Contribution
It provides the first faithful formal semantics for modern JavaScript regexes, including a backtracking tree, and demonstrates its utility through real-world applications.
Findings
Proved equivalence to ECMAScript specification embedding.
Formalized the PikeVM algorithm.
Developed a mechanized semantics in Rocq proof assistant.
Abstract
We present the first mechanized, succinct, practical, complete, and proven-faithful semantics for a modern regular expression language with backtracking semantics. We ensure its faithfulness by proving it equivalent to a preexisting line-by-line embedding of the official ECMAScript specification of JavaScript regular expressions. We demonstrate its practicality by presenting two real-world applications. First, a new notion of contextual equivalence for modern regular expressions, which we use to prove or disprove rewrites drawn from previous work. Second, the first formal proof of the PikeVM algorithm used in many real-world engines. In contrast with the specification and other formalization work, our semantics captures not only the top-priority match, but a full backtracking tree recording all possible matches and their respective priority. All our definitions and results have been…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
