On the Disambiguation of Weighted Automata
Mehryar Mohri, Michael D. Riley

TL;DR
This paper introduces a disambiguation algorithm for weighted automata. The result can be much smaller than an equivalent deterministic automaton, which matters in applications such as speech recognition and machine translation.
Contribution
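The paper presents a two-stage disambiguation algorithm for weighted automata (pre-disambiguation followed by transition removal), proves its correctness, gives sufficient conditions for its applicability in the tropical semiring via the weak twins property, and reports empirical space savings over determinization.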
Findings
The algorithm applies to all acyclic weighted automata.
In some cases, disambiguation returns an automaton exponentially smaller than any equivalent deterministic automaton.
Empirical results show space savings in speech recognition and machine translation.
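The size gap between unambiguous and deterministic automata can be illustrated with a classical textbook example, sketched below in Python. It is not the paper's construction: the automaton is unweighted and cyclic, and the function names are hypothetical. The NFA accepts strings over {a, b} whose n-th symbol from the end is 'a'. It has n + 1 states and is unambiguous, while the subset construction reaches 2^n states.

```python
from itertools import product

def nth_from_end_nfa(n):
    """NFA over {a, b}: accepts strings whose n-th symbol from the end is 'a'.
    States are 0..n, state 0 is initial, state n is final."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta, 0, {n}

def count_accepting_paths(delta, start, finals, word):
    """Number of distinct accepting paths labeled by `word`."""
    counts = {start: 1}
    for sym in word:
        nxt = {}
        for q, c in counts.items():
            for r in delta.get((q, sym), ()):
                nxt[r] = nxt.get(r, 0) + c
        counts = nxt
    return sum(c for q, c in counts.items() if q in finals)

def subset_construction_size(delta, start, alphabet="ab"):
    """Number of reachable state subsets in the standard subset construction."""
    start_set = frozenset([start])
    seen = {start_set}
    stack = [start_set]
    while stack:
        subset = stack.pop()
        for sym in alphabet:
            target = frozenset(r for q in subset for r in delta.get((q, sym), ()))
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return len(seen)

def is_unambiguous_up_to(delta, start, finals, max_len, alphabet="ab"):
    """Brute-force check: no word of length <= max_len has two accepting paths."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            if count_accepting_paths(delta, start, finals, "".join(letters)) > 1:
                return False
    return True
```

Calling `subset_construction_size(*nth_from_end_nfa(n)[:2])` for increasing `n` shows the doubling, while `is_unambiguous_up_to` confirms on short words that the small NFA never has two accepting paths for the same string.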
Abstract
We present a disambiguation algorithm for weighted automata. The algorithm admits two main stages: a pre-disambiguation stage followed by a transition removal stage. We give a detailed description of the algorithm and the proof of its correctness. The algorithm is not applicable to all weighted automata but we prove sufficient conditions for its applicability in the case of the tropical semiring by introducing the *weak twins property*. In particular, the algorithm can be used with all acyclic weighted automata, relevant to applications. While disambiguation can sometimes be achieved using determinization, our disambiguation algorithm in some cases can return a result that is exponentially smaller than any equivalent deterministic automaton. We also present some empirical evidence of the space benefits of disambiguation over determinization in speech recognition and machine translation…
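To make the terms in the abstract concrete, here is a minimal Python sketch of what ambiguity means for an acyclic weighted automaton over the tropical semiring, where the weight of a path is the sum of its transition weights and the weight of a string is the minimum over its accepting paths. This is a brute-force path enumeration for small examples, not the two-stage algorithm described in the paper. The transition encoding `(src, label, weight, dst)`, the function names, and the absence of epsilon transitions and final weights are assumptions made for illustration.

```python
from collections import defaultdict

def accepting_paths(transitions, start, finals):
    """Yield (string, path_weight) for every accepting path.
    Assumes the automaton is acyclic, so the enumeration terminates."""
    out = defaultdict(list)
    for src, label, weight, dst in transitions:
        out[src].append((label, weight, dst))
    stack = [(start, "", 0.0)]
    while stack:
        state, word, total = stack.pop()
        if state in finals:
            yield word, total
        for label, weight, dst in out[state]:
            # Tropical "times" is ordinary addition along a path.
            stack.append((dst, word + label, total + weight))

def tropical_weights(transitions, start, finals):
    """Return (weights, ambiguous): the tropical weight of each accepted string
    and the set of strings labeling more than one accepting path."""
    paths = defaultdict(list)
    for word, total in accepting_paths(transitions, start, finals):
        paths[word].append(total)
    # Tropical "plus" is min over the paths sharing a label.
    weights = {word: min(totals) for word, totals in paths.items()}
    ambiguous = {word for word, totals in paths.items() if len(totals) > 1}
    return weights, ambiguous

# Hypothetical example: "ab" labels two accepting paths, "ac" labels one.
EXAMPLE = [
    (0, "a", 1.0, 1),
    (0, "a", 3.0, 2),
    (1, "b", 2.0, 3),
    (2, "b", 0.5, 3),
    (1, "c", 1.0, 3),
]
```

An unambiguous automaton equivalent to `EXAMPLE` must keep a single accepting path per string while preserving the weights returned by `tropical_weights(EXAMPLE, 0, {3})`. Unlike a deterministic automaton, it may still have several outgoing transitions with the same label from one state.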
Taxonomy
Topics: Semigroups and automata theory · Natural Language Processing Techniques · Logic, programming, and type systems
