Tied Probabilistic Linear Discriminant Analysis for Speech Recognition

Liang Lu; Steve Renals

arXiv:1411.0895·cs.CL·June 23, 2015

TL;DR

This paper introduces a tied PLDA acoustic model for speech recognition that controls model size to avoid overfitting, and reports lower word error rates on the Switchboard corpus than the PLDA mixture model, subspace Gaussian mixture models, and deep neural networks.

Contribution

The paper extends PLDA mixture models with a tied approach to better control model size and prevent overfitting in speech recognition tasks.

Findings

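01 Tied PLDA reduces word error rate compared with the PLDA mixture model, subspace Gaussian mixture models, and deep neural networks.

02 It is effective with both mel frequency cepstral coefficient (MFCC) features and bottleneck features derived from a deep neural network.

03 The improvements are demonstrated on the Switchboard corpus.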

Abstract

Acoustic models using probabilistic linear discriminant analysis (PLDA) capture the correlations within feature vectors using subspaces which do not vastly expand the model. This allows high-dimensional and correlated feature spaces to be used, without requiring the estimation of multiple high-dimensional covariance matrices. In this letter we extend the recently presented PLDA mixture model for speech recognition through a tied PLDA approach, which is better able to control the model size to avoid overfitting. We carried out experiments using the Switchboard corpus, with both mel frequency cepstral coefficient features and bottleneck features derived from a deep neural network. Reductions in word error rate were obtained by using tied PLDA, compared with the PLDA mixture model, subspace Gaussian mixture models, and deep neural networks.
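
The abstract's point about subspaces avoiding "multiple high-dimensional covariance matrices" can be illustrated with a simple parameter count. The sketch below is not the paper's model: it only compares the number of stored covariance parameters for one full covariance matrix per mixture component against a single shared low-rank-plus-diagonal structure of the form U Uᵀ + diag(λ). The function names and the dimensions used (100-dimensional features, a 40-dimensional subspace, 1000 components) are hypothetical, chosen only for illustration, and the count ignores means and any latent-variable parameters.

```python
def full_covariance_params(dim: int) -> int:
    """Free parameters in one symmetric dim x dim covariance matrix."""
    return dim * (dim + 1) // 2


def subspace_covariance_params(dim: int, subspace_dim: int) -> int:
    """Stored parameters in a low-rank-plus-diagonal covariance.

    Assumes the form U U^T + diag(lam), where U is dim x subspace_dim
    and lam holds dim diagonal entries.
    """
    return dim * subspace_dim + dim


def compare(dim: int, subspace_dim: int, num_components: int) -> dict:
    """Compare per-component full covariances with one shared subspace."""
    return {
        "full_per_component": num_components * full_covariance_params(dim),
        "shared_subspace": subspace_covariance_params(dim, subspace_dim),
    }


if __name__ == "__main__":
    # Hypothetical sizes, not taken from the paper.
    print(compare(dim=100, subspace_dim=40, num_components=1000))
```

With these illustrative sizes, the per-component full covariances need 5,050,000 parameters while the shared subspace structure needs 4,100, which is the kind of gap that makes correlated high-dimensional features practical to model.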
