On Indexing and Compressing Finite Automata
Nicola Cotumaccio, Nicola Prezza

TL;DR
This paper introduces an indexing method for finite automata based on a partial co-lexicographic order of the states, enabling efficient pattern matching and compression, and shows that the width p of this order governs the complexity of several fundamental problems on automata.
Contribution
It presents the first solution for indexing arbitrary finite automata using a partial order, generalizes the Burrows-Wheeler transform for automata, and analyzes related complexity bounds.
Findings
Provides an encoding for NFAs and DFAs based on the order width p.
Shows indexed pattern matching in NFAs can be done in O(m p^2) time.
Proves the NP-hardness of indexing NFAs and bounds the size of the DFA obtained from an NFA by the powerset construction.
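The powerset construction mentioned in the last finding can be sketched as follows. This is the textbook construction, not the paper's analysis of it; the function name, the dictionary-based transition encoding, and the toy NFA are assumptions made for illustration only.

```python
from collections import deque


def powerset_construction(delta, start, alphabet):
    """Textbook NFA-to-DFA powerset construction (illustrative sketch).

    delta: dict mapping (state, symbol) -> set of successor states (hypothetical encoding).
    Returns the set of reachable DFA states (as frozensets) and the DFA transitions.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the successors of every NFA state in the current subset.
            target = frozenset(
                t for q in current for t in delta.get((q, symbol), ())
            )
            if not target:
                continue  # omit the empty (sink) subset
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, dfa_delta


# Toy NFA over {a, b}: state 0 loops on both symbols, 0 -a-> 1, 1 -b-> 2.
toy_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
dfa_states, dfa_delta = powerset_construction(toy_delta, 0, "ab")
```

On this toy input only three subsets are reachable, far fewer than the 2^3 subsets that exist in principle; the paper's bound concerns how the order width limits this blow-up in general.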
Abstract
An index for a finite automaton is a powerful data structure that supports locating paths labeled with a query pattern, thus solving pattern matching on the underlying regular language. In this paper, we solve the long-standing problem of indexing arbitrary finite automata. Our solution consists in finding a partial co-lexicographic order of the states and proving, as in the total order case, that states reached by a given string form one interval on the partial order, thus enabling indexing. We provide a lower bound stating that such an interval requires O(p) words to be represented, p being the order's width (i.e. the size of its largest antichain). Indeed, we show that p determines the complexity of several fundamental problems on finite automata: (i) Letting σ be the alphabet size, we provide an encoding for NFAs using …
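Two notions from the abstract can be made concrete with a small sketch: the set of states reached by paths labeled with a pattern (the query an index answers), and the width of a partial order as the size of its largest antichain. Both functions are naive baselines written for illustration, not the paper's index; the names, the input encodings, the toy NFA, and the toy order are all assumptions, and the toy order is an arbitrary example rather than a co-lexicographic order of that NFA.

```python
from itertools import combinations


def states_reached(delta, states, pattern):
    """States at which some path labeled `pattern` ends, starting from any state.

    delta: dict mapping (state, symbol) -> set of successors (hypothetical encoding).
    Runs without any index, scanning the current state set for each character.
    """
    current = set(states)
    for symbol in pattern:
        current = {t for q in current for t in delta.get((q, symbol), ())}
    return current


def width(elements, less):
    """Size of the largest antichain of a strict partial order, by brute force.

    less: set of pairs (u, v) meaning u < v, assumed transitively closed.
    Exponential time; suitable only for tiny examples.
    """
    def comparable(u, v):
        return (u, v) in less or (v, u) in less

    for k in range(len(elements), 0, -1):
        for subset in combinations(elements, k):
            if all(not comparable(u, v) for u, v in combinations(subset, 2)):
                return k
    return 0


# Toy NFA: state 0 loops on a and b, 0 -a-> 1, 1 -b-> 2.
toy_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
reached = states_reached(toy_delta, [0, 1, 2], "ab")

# Toy "diamond" order: 0 < 1, 0 < 2, 1 < 3, 2 < 3 (plus 0 < 3 by transitivity).
diamond = {(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)}
diamond_width = width([0, 1, 2, 3], diamond)
```

In the diamond order, elements 1 and 2 are incomparable and no three elements are pairwise incomparable, so its width is 2.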
