On the influence of language similarity in non-target speaker verification trials
Paul M. Reuter, Michael Jessen

TL;DR
This study examines how language similarity affects scores in cross-lingual non-target speaker verification trials. When a trial involves a training language, the choice of comparison language has minimal impact on scores. In trials involving languages not seen in training, scores correlate with language similarity, especially when the system is trained on multilingual data.
Contribution
It shows that the effect of language similarity on cross-lingual non-target scores depends on the language composition of the training data: on whether the trial languages were seen in training, and on whether training was monolingual or multilingual.
Findings
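Scores cluster in trials involving a training language, where the choice of comparison language has minimal impact.
In trials involving languages not seen in training, scores correlate with language similarity as measured by a language classification system.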
The language similarity effect is most pronounced when the system is trained on multilingual data.
Abstract
In this paper, we investigate the influence of language similarity in cross-lingual non-target speaker verification trials using a state-of-the-art speaker verification system, ECAPA-TDNN, trained on multilingual and monolingual variants of the VoxCeleb dataset. Our analysis of the score distribution patterns on the multilingual GlobalPhone and LDC CTS corpora reveals a clustering effect in speaker comparisons involving a training language, whereby the choice of comparison language only minimally impacts scores. Conversely, we observe a language similarity effect in trials involving languages not included in the training set of the speaker verification system, with scores correlating with language similarity as measured by a language classification system, especially when using multilingual training data.
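The correlation analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the averaging of scores per language pair, the use of Pearson correlation, and the toy numbers are all assumptions made for illustration. In practice the scores would come from a speaker verification system such as ECAPA-TDNN and the similarities from a language classification system.

```python
from collections import defaultdict
from math import sqrt


def mean_score_per_language_pair(trials):
    """Average non-target scores per unordered language pair.

    trials: iterable of (lang_a, lang_b, score) tuples (hypothetical format).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lang_a, lang_b, score in trials:
        key = tuple(sorted((lang_a, lang_b)))  # treat (A, B) and (B, A) alike
        sums[key] += score
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def score_similarity_correlation(trials, similarity):
    """Correlate mean non-target score with language similarity.

    similarity: dict keyed by alphabetically sorted language-pair tuples
    (an assumed convention for this sketch).
    """
    means = mean_score_per_language_pair(trials)
    pairs = sorted(key for key in means if key in similarity)
    return pearson([similarity[p] for p in pairs], [means[p] for p in pairs])


# Toy data with placeholder language labels; the numbers are invented.
toy_trials = [
    ("A", "B", 0.30), ("B", "A", 0.34),
    ("A", "C", 0.10),
    ("B", "C", 0.20), ("C", "B", 0.24),
]
toy_similarity = {("A", "B"): 0.8, ("A", "C"): 0.2, ("B", "C"): 0.5}

r = score_similarity_correlation(toy_trials, toy_similarity)
print(f"correlation between similarity and mean non-target score: {r:.3f}")
```

A correlation near zero for pairs involving a training language and a clearly positive one for pairs of unseen languages would match the pattern the abstract reports, although the paper's actual statistics may be computed differently.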
Taxonomy
Topics: Speech Recognition and Synthesis
Methods: Sparse Evolutionary Training
