Approximations to the MMI criterion and their effect on lattice-based MMI

Steven Wegmann

arXiv:1002.0773·cs.CL·February 4, 2010

TL;DR

This paper analyzes lattice-based MMI training for speech recognition, attributes its poor convergence behavior to approximations of the MMI criterion rather than to overfitting, and proposes modifications that improve stability without losing accuracy.

Contribution

It analyzes the convergence problems of lattice-based MMI in detail and introduces modifications to the methodology that make training more stable.

Findings

01. Lattice-based MMI does not truly converge asymptotically.

02. Overfitting is not the cause of performance degradation.

03. Modified methodology improves convergence without losing accuracy.

Abstract

Maximum mutual information (MMI) is a model selection criterion used for hidden Markov model (HMM) parameter estimation that was developed more than twenty years ago as a discriminative alternative to the maximum likelihood criterion for HMM-based speech recognition. It has been shown in the speech recognition literature that parameter estimation using the current MMI paradigm, lattice-based MMI, consistently outperforms maximum likelihood estimation, but this is at the expense of undesirable convergence properties. In particular, recognition performance is sensitive to the number of times that the iterative MMI estimation algorithm, extended Baum-Welch, is performed. In fact, too many iterations of extended Baum-Welch will lead to degraded performance, despite the fact that the MMI criterion improves at each iteration. This phenomenon is at variance with the analogous behavior of…
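
As a rough illustration of the criterion the abstract refers to, the sketch below computes the MMI objective in its standard textbook form, the summed log posterior of each reference transcript, over a tiny hand-made hypothesis set. The function names, the toy scores, the `kappa` acoustic scale, and the use of a pruned hypothesis list as a stand-in for a lattice are all assumptions made for illustration; none of this is taken from the paper.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mmi_criterion(utterances, kappa=1.0):
    """Summed log posterior of the reference transcripts.

    Each utterance is (ref, hyps), where hyps maps a transcript to a pair
    (acoustic log-likelihood, language-model log-probability).
    kappa is a hypothetical acoustic scaling factor.
    """
    total = 0.0
    for ref, hyps in utterances:
        scores = {w: kappa * ac + lm for w, (ac, lm) in hyps.items()}
        # Numerator: reference score. Denominator: all listed hypotheses.
        total += scores[ref] - log_sum_exp(list(scores.values()))
    return total


def prune_competitors(ref, hyps, keep, kappa=1.0):
    """Keep the reference plus the `keep` best-scoring competitors.

    This is only a crude stand-in for restricting the denominator to the
    hypotheses that survive in a lattice.
    """
    competitors = sorted(
        (w for w in hyps if w != ref),
        key=lambda w: kappa * hyps[w][0] + hyps[w][1],
        reverse=True,
    )[:keep]
    return {w: hyps[w] for w in [ref, *competitors]}


# Toy data (invented): one utterance with a reference and two competitors.
toy_hyps = {
    "a b": (-10.0, -1.0),
    "a c": (-11.0, -1.5),
    "d b": (-14.0, -2.0),
}
full = mmi_criterion([("a b", toy_hyps)])
pruned = mmi_criterion([("a b", prune_competitors("a b", toy_hyps, keep=1))])
print(full, pruned)
```

In this toy setup the pruned value is never lower than the full value, because dropping competitors while keeping the reference can only shrink the denominator. That is one simple way in which a restricted hypothesis set changes the quantity being optimized.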

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Bayesian Methods and Mixture Models