Domain adaptation based Speaker Recognition on Short Utterances

Ahilan Kanagasundaram; David Dean; Sridha Sridharan; Clinton Fookes

arXiv:1610.02831 · cs.SD · October 12, 2016 · 1 cites

TL;DR

This study investigates how domain adaptation affects speaker recognition performance on short utterances. It shows that the advantage of in-domain PLDA models over out-domain ones shrinks as utterances get shorter, and it proposes a modified inter-dataset variability (IDV) compensation method to mitigate domain mismatch.

Contribution

The paper introduces a modified IDV compensation method that improves out-domain PLDA performance by compensating for the mismatch between in- and out-domain data; the gain diminishes as utterances get shorter.

Findings

01

In-domain PLDA outperforms out-domain PLDA on full-length utterances, with more than 28% improvement in EER and DCF.

02

Performance gains from IDV compensation decrease as utterance length shortens due to phonetic variability.

03

IDV-compensated out-domain PLDA improves on uncompensated out-domain PLDA by 26% and 14% when SWB and NIST data, respectively, are used for S-normalization.
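
The percentages above are improvements in error metrics such as EER. As a rough illustration only, the sketch below shows one common way to approximate an equal error rate from trial scores and to express a gain as a relative reduction. The function names, the threshold sweep, and the example numbers are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # Miss rate: fraction of target trials scoring below the threshold.
    miss = np.array([(target_scores < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials at or above the threshold.
    false_alarm = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    i = np.argmin(np.abs(miss - false_alarm))
    return 0.5 * (miss[i] + false_alarm[i])


def relative_improvement(baseline, improved):
    """Relative reduction of an error metric, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical scores and error rates, chosen only to exercise the functions.
eer = equal_error_rate([2.0, 3.0, 4.0, 0.5], [-1.0, 0.0, 1.0, -2.0])
gain = relative_improvement(baseline=10.0, improved=7.5)
```

With real trial lists, a finer threshold grid or interpolation between operating points would give a smoother estimate.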

Abstract

This paper explores how in- and out-domain probabilistic linear discriminant analysis (PLDA) speaker verification behaves when enrolment and verification lengths are reduced. Experimental studies have found that when full-length utterances are used for evaluation, the in-domain PLDA approach shows more than 28% improvement in EER and DCF values over the out-domain PLDA approach; when short utterances are used for evaluation, the performance gain of in-domain speaker verification reduces at an increasing rate. A novel modified inter-dataset variability (IDV) compensation is used to compensate for the mismatch between in- and out-domain data, and the IDV-compensated out-domain PLDA shows 26% and 14% improvement over out-domain PLDA speaker verification when SWB and NIST data, respectively, are used for S-normalization. When the evaluation utterance length is reduced, the performance gain by IDV…
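
The abstract refers to S-normalization with SWB or NIST data as the normalization set. Assuming this means the symmetric score normalization commonly used in i-vector/PLDA systems, a minimal sketch might look like the following. The function name `s_norm` and its arguments are hypothetical, and the paper's exact cohort selection and scoring are not reproduced here.

```python
import numpy as np


def s_norm(raw_score, enrol_cohort_scores, test_cohort_scores):
    """Symmetric score normalization (assumed form, not the paper's code).

    raw_score: score of the enrolment/test trial.
    enrol_cohort_scores: scores of the enrolment side against a cohort set.
    test_cohort_scores: scores of the test side against the same cohort set.
    """
    enrol = np.asarray(enrol_cohort_scores, dtype=float)
    test = np.asarray(test_cohort_scores, dtype=float)
    # Standardize the raw score against each side's cohort statistics.
    z_enrol = (raw_score - enrol.mean()) / enrol.std()
    z_test = (raw_score - test.mean()) / test.std()
    # Average the two standardized scores.
    return 0.5 * (z_enrol + z_test)


# Hypothetical cohort scores, for illustration only.
normalized = s_norm(2.0, [0.0, 2.0], [-1.0, 1.0])
```

In this form, the choice of cohort (for example, in-domain versus out-domain data) changes the means and standard deviations and therefore the normalized score.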

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Natural Language Processing Techniques