Random Wheeler Automata
Ruben Becker, Davide Cenzato, Sung-Hwan Kim, Bojana Kodric, Riccardo Maso, Nicola Prezza

TL;DR
This paper introduces a method for generating deterministic Wheeler automata (WDFAs) uniformly at random, covering the theoretical foundations, an efficient algorithm, and implementation details suited to building large test datasets.
Contribution
It extends the Erdős-Rényi random graph model to Wheeler DFAs, gives an efficient algorithm for uniform generation, and provides formulas for counting and encoding such automata.
Findings
Algorithm generates a uniform WDFA in expected O(m) time
Formulas for the number of distinct WDFAs are provided
Implementation achieves over 8 million transitions per second
Abstract
Wheeler automata were introduced in 2017 as a tool to generalize existing indexing and compression techniques based on the Burrows-Wheeler transform. Intuitively, an automaton is said to be Wheeler if there exists a total order on its states reflecting the co-lexicographic order of the strings labeling the automaton's paths; this property makes it possible to represent the automaton's topology in a constant number of bits per transition, as well as efficiently solving pattern matching queries on its accepted regular language. After their introduction, Wheeler automata have been the subject of a prolific line of research, both from the algorithmic and language-theoretic points of view. A recurring issue faced in these studies is the lack of large datasets of Wheeler automata on which the developed algorithms and theories could be tested. One possible way to overcome this issue is to…
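To make the Wheeler property from the abstract concrete, here is a minimal Python sketch that checks whether a given total order on states satisfies the standard Wheeler axioms. It is an illustration only, not the paper's generation algorithm. It assumes states are integers whose numeric order is the candidate Wheeler order, edges are (source, target, label) triples, and labels are compared with Python's default ordering. The function name is_wheeler_order is hypothetical, and the quadratic pairwise loop favors clarity over speed.

```python
from itertools import product


def is_wheeler_order(edges):
    """Return True if the integer order on states is a Wheeler order.

    edges: list of (u, v, a) triples meaning u --a--> v.
    Assumption: states are integers, and their numeric order is the
    candidate total order being tested. States that appear in no edge
    are ignored by this sketch.
    """
    has_incoming = {v for _, v, _ in edges}
    states = {u for u, _, _ in edges} | has_incoming
    sources = states - has_incoming

    # Axiom 1: states without incoming edges precede all other states.
    if sources and has_incoming and max(sources) > min(has_incoming):
        return False

    # Axioms 2 and 3, checked on every ordered pair of edges.
    for (u1, v1, a1), (u2, v2, a2) in product(edges, repeat=2):
        # Smaller label implies strictly smaller target state.
        if a1 < a2 and not v1 < v2:
            return False
        # Same label and smaller source implies target not larger.
        if a1 == a2 and u1 < u2 and not v1 <= v2:
            return False
    return True


if __name__ == "__main__":
    good = [(0, 1, "a"), (0, 2, "b"), (1, 3, "b")]
    bad = [(0, 2, "a"), (0, 1, "b")]
    print(is_wheeler_order(good), is_wheeler_order(bad))
```

In the first example, states reached by "a" come before states reached by "b", and among the "b" edges the targets follow the order of their sources, so the order is accepted. The second example is rejected because the "a" edge enters a larger state than the "b" edge.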
