On the Complexity of Computing the Co-lexicographic Width of a Regular Language
Ruben Becker, Davide Cenzato, Sung-Hwan Kim, Tomasz Kociumaka, Bojana Kodric, Alberto Policriti, Nicola Prezza

TL;DR
This paper studies the computational complexity of determining the co-lexicographic width of regular languages, providing new algorithms and matching lower bounds, and settling the problem's complexity status.
Contribution
The paper introduces an $O(m^p)$-time algorithm for deciding whether the co-lex width of the language recognized by a given DFA of size $m$ is strictly smaller than a given integer $p$, and establishes a matching conditional lower bound, clarifying the problem's complexity.
Findings
Whether the co-lex width is smaller than $p$ is decidable in $O(m^p)$ time for DFA-recognized languages
Conditional lower bound based on the Strong Exponential Time Hypothesis
PSPACE-complete for NFA inputs
Abstract
Co-lex partial orders were recently introduced in (Cotumaccio et al., SODA 2021 and JACM 2023) as a powerful tool to index finite state automata, with applications to regular expression matching. They generalize Wheeler orders (Gagie et al., Theoretical Computer Science 2017) and naturally reflect the co-lexicographic order of the strings labeling source-to-node paths in the automaton. Briefly, the co-lex width $p$ of a finite-state automaton measures how sortable its states are with respect to the co-lex order among the strings they accept. Automata of co-lex width $p$ can be compressed to $O(\log p)$ bits per edge and admit regular expression matching algorithms running in time proportional to $p^2$ per matched character. The deterministic co-lex width of a regular language $\mathcal{L}$ is the smallest width of such a co-lex order, among all DFAs recognizing $\mathcal{L}$. Since…
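As a small illustration of the co-lexicographic order on strings mentioned in the abstract, the sketch below compares strings from right to left by sorting on their reversals. The function names `colex_key` and `colex_sorted` are hypothetical and chosen only for this example; the sketch covers plain strings, not the order on automaton states or the width computation studied in the paper.

```python
def colex_key(s: str) -> str:
    """Sort key for co-lexicographic order (hypothetical helper).

    Comparing two strings co-lexicographically means comparing them
    character by character starting from the last character, which is
    the same as comparing their reversals lexicographically.
    """
    return s[::-1]


def colex_sorted(strings):
    """Return the input strings sorted in co-lexicographic order."""
    return sorted(strings, key=colex_key)


if __name__ == "__main__":
    words = ["ab", "ba", "aa", "b", "a"]
    # Reversals: "ba", "ab", "aa", "b", "a"; sorting those gives
    # "a" < "aa" < "ab" < "b" < "ba".
    print(colex_sorted(words))  # ['a', 'aa', 'ba', 'b', 'ab']
```

Here strings ending in `a` come before strings ending in `b`, regardless of how they begin, which is the sense in which the order reflects the suffixes of the labels of source-to-node paths.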
Taxonomy
Topics: Natural Language Processing Techniques · semigroups and automata theory · Authorship Attribution and Profiling
