Domain adaptation based Speaker Recognition on Short Utterances
Ahilan Kanagasundaram, David Dean, Sridha Sridharan, Clinton Fookes

TL;DR
This study investigates how domain adaptation techniques affect speaker recognition performance on short utterances, showing that the advantage of in-domain models shrinks as utterances get shorter and proposing a modified inter dataset variability (IDV) compensation method to mitigate domain mismatch.
Contribution
The paper introduces a novel modified IDV compensation method that improves out-domain PLDA performance, especially for short utterances, by addressing dataset mismatch.
Findings
On full-length utterances, in-domain PLDA improves EER and DCF by more than 28% over out-domain PLDA.
Performance gains from IDV compensation decrease as utterance length shortens due to phonetic variability.
IDV-compensated out-domain PLDA improves on uncompensated out-domain PLDA by 26% and 14% when SWB and NIST data, respectively, are used for S normalization.
Abstract
This paper explores how in- and out-domain probabilistic linear discriminant analysis (PLDA) speaker verification behaves when enrolment and verification lengths are reduced. Experimental studies have found that when full-length utterances are used for evaluation, the in-domain PLDA approach shows more than 28% improvement in EER and DCF values over the out-domain PLDA approach, and that when short utterances are used for evaluation, the performance gain of in-domain speaker verification reduces at an increasing rate. A novel modified inter dataset variability (IDV) compensation is used to compensate for the mismatch between in- and out-domain data, and IDV-compensated out-domain PLDA shows 26% and 14% improvement over out-domain PLDA speaker verification when SWB and NIST data, respectively, are used for S normalization. When the evaluation utterance length is reduced, the performance gain by IDV…
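The percentages above are relative changes in error metrics such as the equal error rate (EER). As a rough illustration of how such figures are typically computed, here is a minimal sketch in Python. It is not the authors' evaluation code: the function names `equal_error_rate` and `relative_improvement` are hypothetical, the threshold sweep is a simple approximation, and the scores and EER values in the usage lines are made-up numbers, not results from the paper.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Returns the average of the false-accept and false-reject rates at the
    threshold where the two rates are closest to each other.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))

    # False reject: a target trial scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False accept: a non-target trial scored at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])

    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def relative_improvement(baseline, improved):
    """Relative reduction of an error metric, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Illustrative scores only (hypothetical, not from the paper).
eer = equal_error_rate([0.9, 0.8, 0.3], [0.1, 0.2, 0.7])
# Illustrative EERs only: a drop from 10.0% to 7.2% is a 28% relative gain.
gain = relative_improvement(10.0, 7.2)
```

Under this definition, a lower EER for the compensated or in-domain system yields a positive relative improvement. The same `relative_improvement` helper would apply to DCF values.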
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Natural Language Processing Techniques
