Adversarial Training for Multi-domain Speaker Recognition

Qing Wang; Wei Rao; Pengcheng Guo; Lei Xie

arXiv:2011.08623 · cs.SD · November 18, 2020 · 1 citation

Open Access

TL;DR

This paper introduces an adversarial training approach to improve multi-domain speaker recognition by creating domain-invariant and speaker-discriminative speech representations, effectively addressing domain mismatch and dataset variance issues.

Contribution

The study proposes a novel adversarial training method specifically designed for multi-domain speaker recognition, outperforming existing unsupervised domain adaptation techniques.

Findings

01. Effective in reducing multi-domain mismatch

02. Outperforms existing unsupervised domain adaptation methods

03. Produces robust speaker representations

Abstract

In real-life applications, the performance of speaker recognition systems typically degrades when there is a mismatch between training and evaluation data. Many domain adaptation methods have been used successfully to reduce domain mismatch in speaker recognition. However, both training and evaluation data are usually composed of several subsets, and the variance within each dataset can itself be treated as a set of different domains. Differently distributed subsets in the source or target domain dataset can therefore cause multi-domain mismatches, which affect speaker recognition performance. In this study, we propose to use adversarial training for multi-domain speaker recognition to solve the domain mismatch and the dataset variance problems. By adopting the proposed method, we are able to obtain both multi-domain-invariant and speaker-discriminative speech…
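The abstract is cut off before it describes the model, so the following is only a minimal sketch of one common way to obtain domain-invariant yet speaker-discriminative embeddings: gradient reversal. Whether the paper uses gradient reversal is an assumption, as are the linear classifier heads and every name here (`embedding_gradient`, `spk_w`, `dom_w`, `lam`). The sketch computes the gradient that reaches a speaker embedding when the speaker loss is minimized and the domain loss is reversed.

```python
import math

def matvec(weights, x):
    """Compute W x for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Softmax cross-entropy loss for a single example."""
    return -math.log(softmax(logits)[label])

def cross_entropy_grad(logits, label):
    """Gradient of the loss w.r.t. the logits: softmax(logits) - onehot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]

def backprop_linear(weights, grad_out):
    """Gradient w.r.t. the input of y = W x, i.e. W^T grad_out."""
    n_in = len(weights[0])
    return [sum(row[j] * g for row, g in zip(weights, grad_out))
            for j in range(n_in)]

def embedding_gradient(emb, spk_w, dom_w, spk_label, dom_label, lam):
    """Gradient reaching the embedding under gradient reversal (hypothetical setup).

    The speaker-classifier gradient passes through unchanged, while the
    domain-classifier gradient is negated and scaled by lam.
    """
    g_spk = backprop_linear(spk_w, cross_entropy_grad(matvec(spk_w, emb), spk_label))
    g_dom = backprop_linear(dom_w, cross_entropy_grad(matvec(dom_w, emb), dom_label))
    return [gs - lam * gd for gs, gd in zip(g_spk, g_dom)]
```

Under these assumptions, a descent step along the returned gradient lowers the speaker loss while raising the domain loss, which pushes the embedding toward being uninformative about the domain. The weight `lam` trades off the two objectives; setting it to zero recovers plain speaker training.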

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing