L-Vector: Neural Label Embedding for Domain Adaptation

Zhong Meng; Hu Hu; Jinyu Li; Changliang Liu; Yan Huang; Yifan Gong; Chin-Hui Lee

arXiv:2004.13480 · eess.AS · April 29, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a neural label embedding method for domain adaptation of acoustic models. It transfers knowledge from the source domain to the target domain without requiring paired data and yields up to 14.1% relative WER reduction.

Contribution

The paper presents a novel label embedding scheme that distills source-model knowledge into per-senone label vectors, enabling domain adaptation for speech recognition with unpaired source and target data.

Findings

01 Achieved up to 14.1% relative WER reduction

02 Effective without paired target-source data

03 Applicable to large-scale multi-conditional models

Abstract

We propose a novel neural label embedding (NLE) scheme for the domain adaptation of a deep neural network (DNN) acoustic model with unpaired data samples from source and target domains. With the NLE method, we distill the knowledge from a powerful source-domain DNN into a dictionary of label embeddings, or l-vectors, one for each senone class. Each l-vector is a representation of the senone-specific output distributions of the source-domain DNN and is learned to minimize the average L2, Kullback-Leibler (KL) or symmetric KL distance to the output vectors with the same label through simple averaging or standard back-propagation. During adaptation, the l-vectors serve as the soft targets to train the target-domain model with cross-entropy loss. Without the parallel-data constraint of teacher-student learning, NLE is especially suited for the situation where the paired target-domain data…
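
The two steps described in the abstract (learning one l-vector per senone, then using it as a soft target) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the source-model output vectors are senone posteriors, covers only the L2 case (where the minimizer of the average squared L2 distance is the per-label mean, i.e. "simple averaging"), and uses hypothetical names (`learn_l_vectors`, `soft_target_cross_entropy`) and toy data throughout.

```python
import math


def learn_l_vectors(source_outputs, labels, num_classes):
    """Return one l-vector per class: the mean of the source-model
    output vectors that share that label (L2 case only; an assumption
    of this sketch, not the KL / symmetric-KL variants)."""
    dim = len(source_outputs[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, lab in zip(source_outputs, labels):
        counts[lab] += 1
        for i, v in enumerate(vec):
            sums[lab][i] += v
    # Classes with no samples keep an all-zero vector in this sketch.
    return [[s / c for s in row] if c else row
            for row, c in zip(sums, counts)]


def soft_target_cross_entropy(l_vector, target_posterior, eps=1e-12):
    """Cross-entropy between an l-vector (soft target) and the
    posterior predicted by the target-domain model for one frame."""
    return -sum(p * math.log(q + eps)
                for p, q in zip(l_vector, target_posterior))


# Toy data: three source-domain frames, two senone classes.
source_outputs = [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]]
labels = [0, 0, 1]
l_vectors = learn_l_vectors(source_outputs, labels, num_classes=2)

# A target-domain frame labeled as senone 0 is trained against l_vectors[0]
# instead of a one-hot vector; [0.5, 0.5] stands in for the model's output.
loss = soft_target_cross_entropy(l_vectors[0], [0.5, 0.5])
```

In this sketch the l-vectors depend only on source-domain outputs and labels, and the adaptation loss depends only on target-domain frames and their labels, which is why no frame-by-frame pairing between the two domains is needed.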

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing