Speech Recognition by Composition of Weighted Finite Automata
Fernando C. N. Pereira, Michael D. Riley (AT&T Research)

TL;DR
This paper introduces a unified framework using weighted finite automata and transducers for speech recognition, enabling efficient combination and optimization of various information sources like language models, dictionaries, and acoustic data.
Contribution
It presents a novel, general composition algorithm that integrates multiple information sources in speech recognition within a unified automata-based framework.
Findings
Unified representation of recognition components
Efficient algorithms for source combination and optimization
Single composition algorithm for static and dynamic integration
Abstract
We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can be used for combining information sources in actual recognizers and for optimizing their application. In particular, a single composition algorithm is used both to combine in advance information sources such as language models and dictionaries, and to combine acoustic observations and information sources dynamically during recognition.
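To make the idea of composition concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes epsilon-free machines, weights that are nonnegative negative-log probabilities (added along a path, minimized across paths), and a hypothetical dict-based representation. The names `compose`, `best_path`, `lattice` and `lm` are illustrative only, and the weights are made up for the example. An acceptor is written as a transducer whose input and output labels coincide.

```python
import heapq
import itertools
from collections import defaultdict, deque

# Hypothetical representation (an assumption of this sketch):
#   {"start": state, "finals": {state: weight},
#    "arcs": {state: [(in_label, out_label, weight, next_state), ...]}}

def compose(a, b):
    """Epsilon-free composition: pair arcs where a's output equals b's input."""
    start = (a["start"], b["start"])
    arcs, finals = defaultdict(list), {}
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = queue.popleft()
        # A pair state is final only if both components are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[(qa, qb)] = a["finals"][qa] + b["finals"][qb]
        for i, mid_a, w1, na in a["arcs"].get(qa, []):
            for mid_b, o, w2, nb in b["arcs"].get(qb, []):
                if mid_a == mid_b:
                    nxt = (na, nb)
                    arcs[(qa, qb)].append((i, o, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": dict(arcs)}

def best_path(t):
    """Cheapest accepting path via Dijkstra (assumes nonnegative weights)."""
    final_marker = object()          # stands in for a virtual super-final state
    tie = itertools.count()          # tie-breaker so states are never compared
    heap = [(0.0, next(tie), t["start"], (), ())]
    done = set()
    while heap:
        cost, _, q, ins, outs = heapq.heappop(heap)
        if q is final_marker:
            return cost, list(ins), list(outs)
        if q in done:
            continue
        done.add(q)
        if q in t["finals"]:
            heapq.heappush(heap, (cost + t["finals"][q], next(tie),
                                  final_marker, ins, outs))
        for i, o, w, n in t["arcs"].get(q, []):
            if n not in done:
                heapq.heappush(heap, (cost + w, next(tie), n,
                                      ins + (i,), outs + (o,)))
    return None

# Toy word lattice (acoustic scores) and toy language model, both acceptors.
lattice = {"start": 0, "finals": {2: 0.0},
           "arcs": {0: [("two", "two", 0.5, 1), ("too", "too", 0.75, 1)],
                    1: [("cats", "cats", 0.25, 2)]}}
lm = {"start": "s", "finals": {"f": 0.0},
      "arcs": {"s": [("two", "two", 1.0, "a"), ("too", "too", 0.25, "b")],
               "a": [("cats", "cats", 0.25, "f")],
               "b": [("cats", "cats", 2.0, "f")]}}

composed = compose(lattice, lm)
result = best_path(composed)
print(result)  # (2.0, ['two', 'cats'], ['two', 'cats'])
```

The same `compose` function could in principle be applied ahead of time (for example, to a dictionary and a language model) or at recognition time (to an observation lattice and a precompiled model), which is the point the abstract makes about a single composition algorithm. A practical implementation would also need epsilon handling and on-demand state expansion, both of which this sketch omits.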
Taxonomy
Topics: Machine Learning and Algorithms · Speech Recognition and Synthesis · semigroups and automata theory
