Glushkov's construction for functional subsequential transducers
Aleksander Mendoza-Drosik

TL;DR
This paper extends Glushkov's construction to create compact, efficient, epsilon-free functional subsequential weighted finite state transducers from a special class of regular expressions, enabling fast automaton evaluation.
Contribution
It introduces a new approach for converting a special class of regular expressions into highly compact, efficient automata suitable for multitape transducer compilation.
Findings
Automata contain only one state per input-alphabet symbol of the original expression
Each range of symbols, however large, is a single transition, which allows binary search lookup
The methods enable efficient compilation of regular expressions into transducers
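The one-state-per-symbol property comes from the classical Glushkov (position) construction, which can be sketched as follows. This is a minimal illustration for plain acceptors only, not the paper's extended algorithm: it ignores outputs and weights, and the tuple-based AST encoding and the function name `glushkov` are assumptions made for this example.

```python
def glushkov(expr):
    """Compute Glushkov position sets for a regex AST.

    Hypothetical AST encoding (an assumption of this sketch):
      ("sym", c), ("alt", a, b), ("cat", a, b), ("star", a)
    Returns (symbols, nullable, first, last, follow).
    """
    symbols = []   # symbols[p] is the symbol at position p
    follow = {}    # follow[p] is the set of positions that may come right after p

    def visit(node):
        kind = node[0]
        if kind == "sym":
            p = len(symbols)          # every symbol occurrence gets its own position
            symbols.append(node[1])
            follow[p] = set()
            return False, {p}, {p}
        if kind == "alt":
            n1, f1, l1 = visit(node[1])
            n2, f2, l2 = visit(node[2])
            return n1 or n2, f1 | f2, l1 | l2
        if kind == "cat":
            n1, f1, l1 = visit(node[1])
            n2, f2, l2 = visit(node[2])
            for p in l1:              # last positions of the left side lead into the right side
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind == "star":
            _, f, l = visit(node[1])
            for p in l:               # loop back from the end to the start
                follow[p] |= f
            return True, f, l
        raise ValueError(f"unknown node kind: {kind}")

    nullable, first, last = visit(expr)
    return symbols, nullable, first, last, follow


# (a|b)*c has three symbol occurrences, hence three positions.
expr = ("cat", ("star", ("alt", ("sym", "a"), ("sym", "b"))), ("sym", "c"))
symbols, nullable, first, last, follow = glushkov(expr)
```

In the classical construction each position becomes one state, with one additional initial state; `first` gives the transitions out of the initial state, `follow` gives the remaining transitions, and `last` marks the accepting positions. No epsilon transitions are ever created.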
Abstract
Glushkov's construction has many interesting properties, and they become even more evident when applied to transducers. This article strives to show the vast range of possible extensions and optimisations for this algorithm. A special flavour of regular expressions is introduced, which can be efficiently converted to ε-free functional subsequential weighted finite state transducers. The produced automata are very compact, as they contain only one state for each symbol (from the input alphabet) of the original expression and only one transition for each range of symbols, no matter how large. Such compactified ranges of transitions allow for efficient binary search lookup during automaton evaluation. All the methods and algorithms presented here were used to implement an open-source compiler of regular expressions for multitape transducers.
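The binary search lookup over range transitions mentioned in the abstract could look roughly like the sketch below. The data layout is an assumption made for illustration (the `State` class, the `(lo, hi, output, target)` tuples, and the `step` method are hypothetical names, not taken from the paper or its compiler), and it assumes the ranges of a state are disjoint and inclusive.

```python
from bisect import bisect_right


class State:
    """A state whose outgoing transitions are disjoint inclusive symbol ranges.

    Hypothetical layout: each transition is (lo, hi, output, target).
    """

    def __init__(self, transitions):
        # Sort once by lower bound so that every lookup can use binary search.
        self.transitions = sorted(transitions, key=lambda t: t[0])
        self.starts = [t[0] for t in self.transitions]

    def step(self, symbol):
        """Return (output, target) for symbol, or None if no range covers it."""
        # Index of the last range whose lower bound is <= symbol.
        i = bisect_right(self.starts, symbol) - 1
        if i < 0:
            return None
        lo, hi, output, target = self.transitions[i]
        return (output, target) if symbol <= hi else None


# One transition covers all 26 lowercase letters, another covers all digits.
state = State([
    (ord("a"), ord("z"), "L", 1),
    (ord("0"), ord("9"), "D", 2),
])
```

With this layout a range spanning thousands of code points costs the same single entry as a one-symbol range, and each step takes time logarithmic in the number of ranges rather than in the size of the alphabet.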
Taxonomy
Topics: semigroups and automata theory · Machine Learning and Algorithms · Logic, programming, and type systems
