A Treatise On FST Lattice Based MMI Training
Adnan Haider, Tim Ng, Zhen Huang, Xingyu Na, Antti Veikko Rosti

TL;DR
This paper investigates the implicit modeling decisions in FST lattice based MMI training for speech recognition, emphasizing the importance of on-the-fly FST determinization to improve model discrimination and reduce WER.
Contribution
It introduces the use of on-the-fly FST lattice determinization in MMI training, demonstrating its effectiveness in improving speech recognition accuracy.
Findings
On-the-fly determinization of FST denominator lattices is shown mathematically to guarantee discrimination at the hypothesis level.
Relative WER reduction (WERR) of 2.3-4.6% on assistant and dictation tasks, using deep CNN models trained on an 18K-hour Mandarin dataset and a 2.8K-hour English dataset.
Abstract
Maximum mutual information (MMI) has become one of the two de facto methods for sequence-level training of speech recognition acoustic models. This paper aims to isolate, identify and bring forward the implicit modelling decisions induced by the design and implementation of the standard finite state transducer (FST) lattice based MMI training framework. In particular, the paper investigates the necessity of maintaining a preselected numerator alignment and highlights the importance of determinizing FST denominator lattices on the fly. On-the-fly FST lattice determinization is mathematically shown to guarantee discrimination at the hypothesis level, and its efficacy is empirically shown by training deep CNN models on an 18K-hour Mandarin dataset and on a 2.8K-hour English dataset. On assistant and dictation tasks, the approach achieves between 2.3-4.6% relative WER reduction (WERR) over…
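For readers unfamiliar with the criterion, the following is a minimal sketch of the per-utterance MMI objective: the log score of the reference word sequence minus the log-sum of the scores of all competing paths in the denominator. It is not the paper's implementation. The function names, the flat list-of-paths stand-in for a lattice, and the toy scores are all assumptions made for illustration. The sketch only contrasts two ways of scoring the numerator (a single preselected alignment versus a sum over every path that carries the reference words); it does not reproduce the paper's analysis of determinization.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mmi_objective(reference, paths, numerator_path=None):
    """Log posterior of the reference word sequence for one utterance.

    paths: list of (word_sequence, log_score) pairs, one per lattice path.
           log_score is assumed to already combine the scaled acoustic
           log-likelihood and the language model log-probability.
    numerator_path: index of a preselected alignment to use as the
           numerator. If None, the numerator sums over every path whose
           word sequence equals the reference.
    """
    if numerator_path is not None:
        numerator = paths[numerator_path][1]
    else:
        numerator = logsumexp([s for words, s in paths if words == reference])
    denominator = logsumexp([s for _, s in paths])
    return numerator - denominator


# Hypothetical toy lattice: two alignments of the reference, one competitor.
toy_paths = [
    (("call", "mom"), -10.0),
    (("call", "mom"), -11.0),
    (("call", "tom"), -10.5),
]
ref = ("call", "mom")

fixed = mmi_objective(ref, toy_paths, numerator_path=0)   # one alignment
summed = mmi_objective(ref, toy_paths)                    # all alignments
```

In this toy case the fixed-alignment numerator gives a lower objective than the summed one, because the second path with the same words counts only against the reference in the denominator. How a real lattice handles several paths with identical word sequences depends on the toolkit and on whether, and how, the lattice is determinized.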
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Neural Networks and Applications
