Unsupervised Domain Adaptation for Robust Speech Recognition via Variational Autoencoder-Based Data Augmentation

Wei-Ning Hsu; Yu Zhang; James Glass

arXiv:1707.06265·cs.CL·September 25, 2017

TL;DR

This paper introduces an unsupervised domain adaptation method for speech recognition. It uses a variational autoencoder to augment the training data by transforming nuisance attributes of speech, which improves robustness across domains.

Contribution

It proposes a novel VAE-based data augmentation technique that adapts speech models to new domains without requiring target domain transcripts.

Findings

01

Reduced word error rate (WER) by up to 35% on the CHiME-4 dataset

02

Effective domain adaptation without target transcripts

03

Improved robustness in real-world speech recognition

Abstract

Domain mismatch between training and testing can lead to significant degradation in performance in many machine learning scenarios. Unfortunately, this is not a rare situation for automatic speech recognition deployments in real-world applications. Research on robust speech recognition can be regarded as trying to overcome this domain mismatch issue. In this paper, we address the unsupervised domain adaptation problem for robust speech recognition, where both source and target domain speech are presented, but word transcripts are only available for the source domain speech. We present novel augmentation-based methods that transform speech in a way that does not change the transcripts. Specifically, we first train a variational autoencoder on both source and target domain data (without supervision) to learn a latent representation of speech. We then transform nuisance attributes of…
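The idea of augmenting labeled data by modifying latent representations can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the linear `encode`/`decode` pair is a hypothetical stand-in for a trained VAE, and the mean-shift rule in `shift_toward_target` is one simple way to move source-domain latents toward the target domain, not necessarily the transformation used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, LATENT_DIM = 8, 4

# Hypothetical stand-in for a trained VAE: a fixed linear decoder and its
# pseudo-inverse as the encoder. A real system would use learned networks.
W = rng.standard_normal((FEAT_DIM, LATENT_DIM))
W_PINV = np.linalg.pinv(W)


def encode(x):
    """Map features of shape (n, FEAT_DIM) to latents of shape (n, LATENT_DIM)."""
    return x @ W_PINV.T


def decode(z):
    """Map latents of shape (n, LATENT_DIM) back to features (n, FEAT_DIM)."""
    return z @ W.T


def shift_toward_target(x_source, x_target):
    """Shift source latents by the difference of domain means, then decode.

    Assumption: the domain-level (nuisance) difference is captured by the
    mean latent vector of each domain. Transcripts of x_source are reused
    unchanged for the returned features.
    """
    z_src = encode(x_source)
    z_tgt = encode(x_target)
    delta = z_tgt.mean(axis=0) - z_src.mean(axis=0)
    return decode(z_src + delta)


# Toy data: the "target" domain is offset from the "source" domain.
x_source = decode(rng.standard_normal((100, LATENT_DIM)))
x_target = decode(rng.standard_normal((50, LATENT_DIM)) + 3.0)
x_augmented = shift_toward_target(x_source, x_target)
```

In this toy setup the augmented features keep the per-example variation of the source data while their latent mean matches the target domain, so the source transcripts can be paired with `x_augmented` as extra training data.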
