Derivative-Based Diagnosis of Regular Expression Ambiguity
Martin Sulzmann, Kenny Zhuo Ming Lu

TL;DR
This paper uses Brzozowski's derivatives to diagnose ambiguity in regular expressions, providing tools to generate parse trees and minimal counter-examples and to compare disambiguation policies.
Contribution
It presents a novel derivative-based finite state transducer for diagnosing regular expression ambiguity and comparing disambiguation policies.
Findings
Can generate parse trees and minimal counter-examples
Allows comparison between POSIX and Greedy policies
Facilitates diagnosis of ambiguity in regular expressions
Abstract
Regular expressions are often ambiguous. We present a novel method based on Brzozowski's derivatives to aid the user in diagnosing ambiguous regular expressions. We introduce a derivative-based finite state transducer to generate parse trees and minimal counter-examples. The transducer can easily be customized to follow either the POSIX or the Greedy disambiguation policy, and, based on a finite set of examples, it is possible to examine whether POSIX and Greedy differ.
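To make the notions in the abstract concrete, here is a minimal Python sketch. It is not the paper's finite state transducer: it pairs a textbook Brzozowski-derivative matcher with a brute-force parse-tree counter, and reports the shortest word that has more than one parse tree. The tuple encoding of regular expressions, all function names, and the restriction that each star iteration consumes at least one symbol (which keeps the number of parse trees finite) are assumptions made for illustration.

```python
from functools import lru_cache
from itertools import product

# Hypothetical tuple encoding of regular expressions (not from the paper).
EPS = ("eps",)   # matches only the empty word
PHI = ("phi",)   # matches nothing

def sym(c): return ("sym", c)
def alt(r, s): return ("alt", r, s)
def seq(r, s): return ("seq", r, s)
def star(r): return ("star", r)

def nullable(r):
    """True if r accepts the empty word."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("phi", "sym"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # seq

def deriv(r, c):
    """Brzozowski derivative of r with respect to the symbol c."""
    tag = r[0]
    if tag in ("eps", "phi"):
        return PHI
    if tag == "sym":
        return EPS if r[1] == c else PHI
    if tag == "alt":
        return alt(deriv(r[1], c), deriv(r[2], c))
    if tag == "star":
        return seq(deriv(r[1], c), r)
    left = seq(deriv(r[1], c), r[2])  # seq case
    return alt(left, deriv(r[2], c)) if nullable(r[1]) else left

def matches(r, w):
    """Match w by taking one derivative per symbol."""
    for c in w:
        r = deriv(r, c)
    return nullable(r)

@lru_cache(maxsize=None)
def count_trees(r, w):
    """Number of parse trees of w for r.
    Assumption: every star iteration consumes at least one symbol."""
    tag = r[0]
    if tag == "eps":
        return 1 if w == "" else 0
    if tag == "phi":
        return 0
    if tag == "sym":
        return 1 if w == r[1] else 0
    if tag == "alt":
        return count_trees(r[1], w) + count_trees(r[2], w)
    if tag == "seq":
        return sum(count_trees(r[1], w[:i]) * count_trees(r[2], w[i:])
                   for i in range(len(w) + 1))
    if w == "":  # star with no iterations
        return 1
    return sum(count_trees(r[1], w[:i]) * count_trees(r, w[i:])
               for i in range(1, len(w) + 1))

def shortest_ambiguous_word(r, alphabet, max_len):
    """First word, by increasing length, with more than one parse tree."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if count_trees(r, w) > 1:
                return w
    return None

# (a | ab)(b | eps): "ab" parses as a.b and as ab.eps
r = seq(alt(sym("a"), seq(sym("a"), sym("b"))), alt(sym("b"), EPS))
print(matches(r, "ab"), count_trees(r, "ab"), shortest_ambiguous_word(r, "ab", 3))
```

For the expression (a | ab)(b | eps), the word "ab" has two parse trees: one matches "a" with the left alternative and then "b", the other matches "ab" with the right alternative and then the empty word. A disambiguation policy such as POSIX or Greedy is what selects one of these trees. The exhaustive search here is exponential in the word length and only serves to illustrate what a minimal counter-example is.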
Taxonomy
Topics: Natural Language Processing Techniques · semigroups and automata theory · Logic, programming, and type systems
