Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training

Sameer Khurana; Niko Moritz; Takaaki Hori; Jonathan Le Roux

arXiv:2011.13439·cs.CL·February 17, 2021

TL;DR

This paper introduces DUST, an uncertainty-driven self-training method for unsupervised domain adaptation in speech recognition, significantly improving performance by filtering uncertain pseudo-labels.

Contribution

The paper proposes DUST, a dropout-based uncertainty measure for effective pseudo-label filtering in unsupervised domain adaptation for ASR.

Findings

01. DUST recovers up to 80% of the performance gap in domain adaptation.

02. Uncertainty filtering improves ASR accuracy over standard self-training.

03. Training time is reduced due to data filtering.

Abstract

The performance of automatic speech recognition (ASR) systems typically degrades significantly when the training and test data domains are mismatched. In this paper, we show that self-training (ST) combined with an uncertainty-based pseudo-label filtering approach can be effectively used for domain adaptation. We propose DUST, a dropout-based uncertainty-driven self-training technique which uses agreement between multiple predictions of an ASR system obtained for different dropout settings to measure the model's uncertainty about its prediction. DUST excludes pseudo-labeled data with high uncertainty from training, which leads to substantially better ASR results than ST without filtering and shortens training because the training set is smaller. Domain adaptation experiments using WSJ as a source domain and TED-LIUM 3 as well as SWITCHBOARD as the target…
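The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that agreement is measured by a length-normalized edit distance between one hypothesis decoded with dropout disabled and several hypotheses decoded with dropout enabled, and that an utterance is kept only when the largest such distance stays at or below a threshold. The `decode` callable, `num_samples`, and `threshold` are hypothetical names and values chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def max_disagreement(reference: Sequence[str],
                     samples: Sequence[Sequence[str]]) -> float:
    """Largest edit distance to the reference, normalized by reference length."""
    denom = max(len(reference), 1)
    return max((edit_distance(reference, s) / denom for s in samples),
               default=0.0)


def filter_pseudo_labels(
    utterances: Sequence[str],
    decode: Callable[[str, bool], List[str]],  # hypothetical: (utterance, dropout_on) -> tokens
    num_samples: int = 3,     # assumed number of dropout-enabled decodes
    threshold: float = 0.3,   # assumed acceptance threshold
) -> List[Tuple[str, List[str]]]:
    """Keep (utterance, pseudo-label) pairs whose dropout samples agree with the reference."""
    kept = []
    for utt in utterances:
        reference = decode(utt, False)  # dropout disabled: this becomes the pseudo-label
        samples = [decode(utt, True) for _ in range(num_samples)]
        if max_disagreement(reference, samples) <= threshold:
            kept.append((utt, reference))
    return kept
```

In use, `decode` would wrap an ASR model whose dropout layers can be switched on at inference time. The retained pairs form the reduced pseudo-labeled set for the next self-training round, which is where the training-time saving mentioned in the abstract would come from.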

Taxonomy

Methods: Dropout