Unsupervised Domain Discovery using Latent Dirichlet Allocation for Acoustic Modelling in Speech Recognition
Mortaza Doulaty, Oscar Saz, Thomas Hain

TL;DR
This paper introduces an unsupervised method using Latent Dirichlet Allocation to discover domains in speech data, enabling the creation of domain-specific acoustic models that improve recognition accuracy.
Contribution
It proposes a novel LDA-based approach for unsupervised domain discovery in acoustic modelling, reducing reliance on manual domain labels.
Findings
LDA-based domain classification reduces word error rate (WER) by up to 16%.
Unsupervised domain discovery outperforms pooled training.
MAP adaptation to LDA domains enhances speech recognition accuracy.
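To make the MAP adaptation finding concrete, here is a minimal sketch of the standard mean-only MAP update for a single Gaussian, which interpolates between a prior (pooled) mean and the statistics of one domain's data. The paper's exact adaptation setup is not given in this summary, so the function name `map_adapt_mean`, the prior weight `tau`, and the toy values are all assumptions for illustration.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Mean-only MAP update for one Gaussian (illustrative sketch).

    prior_mean : mean of the pooled (prior) model, list of floats
    frames     : feature vectors from one discovered domain
    posteriors : occupation probability of this Gaussian for each frame
    tau        : hypothetical prior weight; larger values stay closer to the prior
    """
    occupancy = sum(posteriors)
    dim = len(prior_mean)
    # Posterior-weighted sum of the domain's frames, per dimension.
    weighted_sum = [
        sum(g * x[i] for g, x in zip(posteriors, frames)) for i in range(dim)
    ]
    # Interpolate between the prior mean and the domain statistics.
    return [
        (tau * prior_mean[i] + weighted_sum[i]) / (tau + occupancy)
        for i in range(dim)
    ]


# Toy example: two frames fully assigned to the Gaussian, prior weight 2.
adapted = map_adapt_mean(
    prior_mean=[0.0, 0.0],
    frames=[[2.0, 4.0], [2.0, 4.0]],
    posteriors=[1.0, 1.0],
    tau=2.0,
)
print(adapted)  # [1.0, 2.0]
```

With little domain data the adapted mean stays near the pooled model, and with more data it moves towards the domain's own statistics, which is why MAP is a common choice when each discovered domain holds only part of the training set.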
Abstract
Speech recognition systems are often highly domain dependent, a fact widely reported in the literature. However, the concept of domain is complex and not bound to clear criteria. Hence, it is often not evident whether data should be considered out-of-domain. While both acoustic and language models can be domain specific, the work in this paper concentrates on acoustic modelling. We present a novel method to perform unsupervised discovery of domains using Latent Dirichlet Allocation (LDA) modelling. Here, a set of hidden domains is assumed to exist in the data, whereby each audio segment can be considered a weighted mixture of domain properties. The classification of audio segments into domains allows the creation of domain-specific acoustic models for automatic speech recognition. Experiments are conducted on a dataset of diverse speech data covering speech from radio and TV broadcasts,…
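The idea of treating each audio segment as a weighted mixture of hidden domains can be sketched with a small collapsed Gibbs sampler for LDA. This is only an illustration under stated assumptions: it assumes each segment has already been converted into a bag of discrete "acoustic word" indices (for example by quantising frames), which the truncated abstract does not specify, and the function name `lda_gibbs`, the hyperparameters, and the toy data are hypothetical rather than the authors' implementation.

```python
import random


def lda_gibbs(docs, n_domains, vocab_size, alpha=0.1, beta=0.01,
              n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-segment domain weights."""
    rng = random.Random(seed)
    doc_dom = [[0] * n_domains for _ in docs]                # segment-domain counts
    dom_word = [[0] * vocab_size for _ in range(n_domains)]  # domain-word counts
    dom_total = [0] * n_domains                              # tokens per domain
    assign = []

    # Random initial domain assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_domains)
            z_d.append(k)
            doc_dom[d][k] += 1
            dom_word[k][w] += 1
            dom_total[k] += 1
        assign.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_dom[d][k] -= 1
                dom_word[k][w] -= 1
                dom_total[k] -= 1
                # Conditional probability of each domain for this token.
                weights = [
                    (doc_dom[d][j] + alpha) * (dom_word[j][w] + beta)
                    / (dom_total[j] + vocab_size * beta)
                    for j in range(n_domains)
                ]
                k = rng.choices(range(n_domains), weights=weights)[0]
                assign[d][i] = k
                doc_dom[d][k] += 1
                dom_word[k][w] += 1
                dom_total[k] += 1

    # Smoothed domain mixture for each segment; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_domains * alpha) for c in counts]
        for counts, doc in zip(doc_dom, docs)
    ]


# Toy segments: the first two use acoustic words 0-2, the last two use 3-5.
segments = [
    [0, 1, 2, 0, 1, 0, 2, 1],
    [1, 0, 2, 2, 0, 1, 1, 0],
    [3, 4, 5, 3, 4, 5, 4, 3],
    [5, 4, 3, 3, 5, 4, 4, 5],
]
theta = lda_gibbs(segments, n_domains=2, vocab_size=6)

# Hard domain label per segment: the domain with the largest weight.
labels = [max(range(2), key=lambda k: row[k]) for row in theta]
for row, label in zip(theta, labels):
    print([round(p, 3) for p in row], "-> domain", label)
```

The hard labels are what would be used to split training data into domain-specific subsets, while the soft weights in `theta` keep the "weighted mixture" view of each segment described in the abstract.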
