Latent Dirichlet Allocation Based Organisation of Broadcast Media Archives for Deep Neural Network Adaptation
Mortaza Doulaty, Oscar Saz, Raymond W. M. Ng, Thomas Hain

TL;DR
This paper introduces an unsupervised method using Latent Dirichlet Allocation to discover latent domains in broadcast media, improving DNN adaptation for speech recognition and reducing error rates.
Contribution
It combines LDA-based latent domain discovery with the Unique Binary Code (UBIC) representation to adapt DNNs for speech recognition of diverse broadcast media.
Findings
LDA-UBIC DNNs reduce recognition error by up to 13% relative.
Unsupervised latent domain discovery improves acoustic modeling.
The method was evaluated on a set of BBC TV broadcasts.
Abstract
This paper presents a new method for the discovery of latent domains in diverse speech data, for the use of adaptation of Deep Neural Networks (DNNs) for Automatic Speech Recognition. Our work focuses on transcription of multi-genre broadcast media, which is often only categorised broadly in terms of high level genres such as sports, news, documentary, etc. However, in terms of acoustic modelling these categories are coarse. Instead, it is expected that a mixture of latent domains can better represent the complex and diverse behaviours within a TV show, and therefore lead to better and more robust performance. We propose a new method, whereby these latent domains are discovered with Latent Dirichlet Allocation, in an unsupervised manner. These are used to adapt DNNs using the Unique Binary Code (UBIC) representation for the LDA domains. Experiments conducted on a set of BBC TV…
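The adaptation step described in the abstract can be illustrated with a minimal sketch. It assumes that a per-utterance posterior over latent domains is already available from an LDA model, and that the UBIC representation is a one-hot code for the most probable domain, appended to every acoustic feature frame before it enters the DNN. The function names `ubic_code` and `augment_frames`, the three-domain posterior, and the toy two-dimensional frames are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def ubic_code(domain_posterior: Sequence[float]) -> List[int]:
    """Return a one-hot code marking the most probable latent domain.

    Assumption: UBIC is treated here as a plain one-hot vector.
    """
    if not domain_posterior:
        raise ValueError("domain_posterior must not be empty")
    best = max(range(len(domain_posterior)), key=lambda k: domain_posterior[k])
    return [1 if k == best else 0 for k in range(len(domain_posterior))]


def augment_frames(frames: Sequence[Sequence[float]],
                   code: Sequence[int]) -> List[List[float]]:
    """Append the same domain code to every acoustic frame of an utterance."""
    return [list(frame) + list(code) for frame in frames]


# Hypothetical LDA posterior over three latent domains for one utterance.
posterior = [0.1, 0.7, 0.2]
# Toy acoustic features: two frames with two dimensions each.
frames = [[0.5, -1.2], [0.3, 0.8]]

code = ubic_code(posterior)
augmented = augment_frames(frames, code)
```

Each augmented frame keeps its original acoustic dimensions and gains one extra input per latent domain, so the network can learn domain-dependent behaviour through the weights attached to those extra inputs.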
Taxonomy
Methods: Latent Dirichlet Allocation
