Incremental Construction of Minimal Acyclic Sequential Transducers from Unsorted Data
Wojciech Skut

TL;DR
This paper presents an efficient algorithm for incrementally building a minimal acyclic sequential transducer from a dictionary of input and output strings. Unlike the earlier algorithm of Mihov and Maurel (2001), it does not require the input strings to be sorted. The method is illustrated on pronunciation dictionaries.
Contribution
It generalises a known method for constructing minimal finite-state automata (Daciuk et al. 2000) to sequential transducers, and it removes the requirement of the Mihov and Maurel (2001) algorithm that the input strings be sorted.
Findings
The algorithm efficiently constructs minimal acyclic sequential transducers from unsorted data.
The method is applicable to pronunciation dictionaries.
Unlike the algorithm of Mihov and Maurel (2001), it does not require the input strings to be sorted.
Abstract
This paper presents an efficient algorithm for the incremental construction of a minimal acyclic sequential transducer (ST) for a dictionary consisting of a list of input and output strings. The algorithm generalises a known method of constructing minimal finite-state automata (Daciuk et al. 2000). Unlike the algorithm published by Mihov and Maurel (2001), it does not require the input strings to be sorted. The new method is illustrated by an application to pronunciation dictionaries.
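To make the notion of a sequential transducer built from unsorted entries more concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a trie-shaped transducer and omits the minimisation step (merging equivalent states) entirely. It only illustrates one ingredient such constructions rely on: when a new entry shares an input prefix with existing ones, the output on each shared arc is reduced to the longest common prefix of the outputs, and the remainder is pushed further down. The names `State`, `push_down`, `insert`, `lookup`, and the sample entries are hypothetical.

```python
from os.path import commonprefix  # character-wise longest common prefix


class State:
    def __init__(self):
        self.trans = {}    # input symbol -> [output string, target State]
        self.final = None  # None if not final, else the output emitted at the end


def push_down(state, residue):
    """Prepend residue to every output leaving state, including its final output."""
    if not residue:
        return
    for arc in state.trans.values():
        arc[0] = residue + arc[0]
    if state.final is not None:
        state.final = residue + state.final


def insert(root, word, output):
    """Add word -> output; entries may arrive in any order."""
    state = root
    for ch in word:
        arc = state.trans.get(ch)
        if arc is None:
            # New branch: emit all remaining output on the first new arc.
            arc = [output, State()]
            state.trans[ch] = arc
            output = ""
        else:
            # Shared arc: keep only the common output prefix here and
            # push the rest of the old output down to the target state.
            common = commonprefix([arc[0], output])
            push_down(arc[1], arc[0][len(common):])
            arc[0] = common
            output = output[len(common):]
        state = arc[1]
    state.final = output


def lookup(root, word):
    """Return the output for word, or None if word is not in the dictionary."""
    state, out = root, []
    for ch in word:
        arc = state.trans.get(ch)
        if arc is None:
            return None
        out.append(arc[0])
        state = arc[1]
    if state.final is None:
        return None
    return "".join(out) + state.final


# Hypothetical pronunciation entries, deliberately not in sorted order.
entries = [("cat", "k ae t"), ("car", "k aa r"), ("cab", "k ae b")]
root = State()
for word, pron in entries:
    insert(root, word, pron)
```

After these insertions, `lookup(root, "car")` returns `"k aa r"`, and the arc for `c` leaving the root carries only the shared output prefix `"k a"`. A full construction in the spirit of the paper would additionally keep the transducer minimal after each insertion, which this sketch does not attempt.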
Taxonomy
Topics: Semigroups and automata theory · Algorithms and Data Compression · Machine Learning and Algorithms
