Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation

Pawel Swietojanski; Jinyu Li; Steve Renals

arXiv:1601.02828 · cs.CL · July 14, 2016

TL;DR

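This paper studies learning hidden unit contributions (LHUC), a method for unsupervised speaker and environment adaptation of neural network acoustic models that linearly re-combines hidden units using small amounts of adaptation data. It reduces word error rate across four speech recognition benchmarks without auxiliary speaker-dependent feature extractors.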

Contribution

The paper extends LHUC to a speaker adaptive training (SAT) framework, yielding a more adaptable DNN acoustic model without auxiliary speaker-dependent feature extractors or significant speaker-dependent changes to the DNN structure.

Findings

01. LHUC achieves 5-23% relative WER reduction across benchmarks.

02. The method works with limited adaptation data and in one-shot scenarios.

03. LHUC complements other adaptation techniques effectively.

Abstract

This work presents a broad study on the adaptation of neural network acoustic models by means of learning hidden unit contributions (LHUC) -- a method that linearly re-combines hidden units in a speaker- or environment-dependent manner using small amounts of unsupervised adaptation data. We also extend LHUC to a speaker adaptive training (SAT) framework that leads to a more adaptable DNN acoustic model, working both in a speaker-dependent and a speaker-independent manner, without the requirements to maintain auxiliary speaker-dependent feature extractors or to introduce significant speaker-dependent changes to the DNN structure. Through a series of experiments on four different speech recognition benchmarks (TED talks, Switchboard, AMI meetings, and Aurora4) comprising 270 test speakers, we show that LHUC in both its test-only and SAT variants results in consistent word error rate…
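
The phrase "linearly re-combines hidden units in a speaker-dependent manner" can be illustrated with a minimal sketch: each hidden unit's output is multiplied by a per-speaker amplitude, and only those amplitudes would be learned during adaptation while the speaker-independent weights stay fixed. The re-parameterisation of the amplitude as 2·sigmoid(r), the function names, and the toy values below are assumptions of this sketch and are not given in the abstract above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lhuc_amplitudes(r):
    """Map unconstrained per-unit parameters r to amplitudes in (0, 2).

    Assumption of this sketch: amplitude = 2 * sigmoid(r), so r = 0
    gives an amplitude of exactly 1 (no change to the unit).
    """
    return [2.0 * sigmoid(r_k) for r_k in r]


def hidden_layer(x, weights, biases):
    """Speaker-independent layer: sigmoid(W x + b); weights is a list of rows."""
    return [
        sigmoid(sum(w * x_i for w, x_i in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def lhuc_layer(x, weights, biases, r):
    """Rescale each hidden unit's output by a speaker-dependent amplitude."""
    h = hidden_layer(x, weights, biases)
    return [a * h_k for a, h_k in zip(lhuc_amplitudes(r), h)]


# Toy values, purely illustrative (hypothetical, not from the paper).
x = [0.5, -1.0]
W = [[0.2, -0.4], [1.0, 0.3], [-0.7, 0.1]]
b = [0.0, 0.1, -0.2]

r_unadapted = [0.0, 0.0, 0.0]  # amplitude 1.0: same as the unadapted layer
r_speaker = [1.5, 0.0, -3.0]   # amplify unit 0, keep unit 1, damp unit 2

h_si = hidden_layer(x, W, b)
h_init = lhuc_layer(x, W, b, r_unadapted)
h_adapted = lhuc_layer(x, W, b, r_speaker)
```

In an actual adaptation pass, the values in `r_speaker` would be estimated by gradient descent on the adaptation data (with first-pass recognition output as targets in the unsupervised case); here they are set by hand only to show the effect of the rescaling.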
