Adversarial Training for Multi-domain Speaker Recognition
Qing Wang, Wei Rao, Pengcheng Guo, Lei Xie

TL;DR
This paper introduces an adversarial training approach to improve multi-domain speaker recognition by creating domain-invariant and speaker-discriminative speech representations, effectively addressing domain mismatch and dataset variance issues.
Contribution
The study proposes a novel adversarial training method specifically designed for multi-domain speaker recognition, outperforming existing unsupervised domain adaptation techniques.
Findings
Effective in reducing multi-domain mismatch
Outperforms existing unsupervised domain adaptation methods
Produces robust speaker representations
Abstract
In real-life applications, the performance of speaker recognition systems always degrades when there is a mismatch between training and evaluation data. Many domain adaptation methods have been successfully used to eliminate domain mismatches in speaker recognition. However, both training and evaluation data can themselves be composed of several subsets. The inner variances of each dataset can also be considered as different domains. Differently distributed subsets in the source or target domain dataset can also cause multi-domain mismatches, which affect speaker recognition performance. In this study, we propose to use adversarial training for multi-domain speaker recognition to solve the domain mismatch and the dataset variance problems. By adopting the proposed method, we are able to obtain both multi-domain-invariant and speaker-discriminative speech…
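The abstract does not give the loss used, so the following is only a rough sketch of the standard domain-adversarial objective, assumed here for illustration rather than taken from the paper. The encoder is trained to lower the speaker classification loss while raising the domain classification loss, and the domain classifier is trained to lower that same domain loss. The function names, the weight `lam`, and the example logits are all hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def adversarial_objectives(speaker_logits, speaker_label,
                           domain_logits, domain_label, lam=1.0):
    """Return (encoder_loss, domain_classifier_loss).

    Assumed formulation (not confirmed by the source):
      encoder_loss           = L_speaker - lam * L_domain
      domain_classifier_loss = L_domain
    Minimizing encoder_loss keeps embeddings speaker-discriminative
    while making the domains hard to tell apart.
    """
    spk_loss = cross_entropy(speaker_logits, speaker_label)
    dom_loss = cross_entropy(domain_logits, domain_label)
    encoder_loss = spk_loss - lam * dom_loss
    return encoder_loss, dom_loss


# Hypothetical example: 2 speakers, 3 domains (subsets).
# Uniform domain logits mean the domain classifier cannot tell the subsets apart.
enc_loss, dom_loss = adversarial_objectives(
    speaker_logits=[2.0, 0.0], speaker_label=0,
    domain_logits=[0.0, 0.0, 0.0], domain_label=1,
    lam=0.5,
)
```

With uniform domain logits, the domain loss equals the natural log of the number of domains, which is the value a fully domain-invariant representation would push the domain classifier toward under this assumed formulation.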
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
