On the influence of language similarity in non-target speaker verification trials

Paul M. Reuter; Michael Jessen

arXiv:2506.02777·eess.AS·June 4, 2025

Open Access

TL;DR

This study examines how language similarity affects scores in cross-lingual non-target speaker verification trials. When a trial involves a training language, scores cluster and the choice of comparison language has minimal impact; when the languages were not seen in training, scores correlate with language similarity, especially under multilingual training.

Contribution

The paper shows that the language composition of the training data determines whether language similarity affects cross-lingual non-target scores: the effect is minimal for trials involving a training language and appears for languages outside the training set.

Findings

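01

Scores cluster in comparisons involving a training language, and the choice of comparison language has minimal impact.

02

In trials involving languages not included in the training set, scores correlate with language similarity.

03

The language similarity effect is strongest when the system is trained on multilingual data.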

Abstract

In this paper, we investigate the influence of language similarity in cross-lingual non-target speaker verification trials using a state-of-the-art speaker verification system, ECAPA-TDNN, trained on multilingual and monolingual variants of the VoxCeleb dataset. Our analysis of the score distribution patterns on multilingual GlobalPhone and LDC CTS reveals a clustering effect in speaker comparisons involving a training language, whereby the choice of comparison language only minimally impacts scores. Conversely, we observe a language similarity effect in trials involving languages not included in the training set of the speaker verification system, with scores correlating with language similarity measured by a language classification system, especially when using multilingual training data.
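The correlation analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the helper names (`mean_score_per_pair`, `pearson`), the language codes, the scores, and the similarity values are all assumptions made up for illustration, and the paper may well use a different correlation measure or aggregation. The sketch averages non-target scores per unordered language pair and then computes a Pearson correlation against a per-pair similarity value of the kind a language classification system might supply.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def mean_score_per_pair(trials):
    """Average non-target scores for each unordered language pair."""
    buckets = defaultdict(list)
    for lang_a, lang_b, score in trials:
        # Sort the pair so that (pt, ko) and (ko, pt) share one bucket.
        buckets[tuple(sorted((lang_a, lang_b)))].append(score)
    return {pair: mean(scores) for pair, scores in buckets.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical non-target trials: (language A, language B, verification score).
# These numbers are invented for illustration and are NOT results from the paper.
trials = [
    ("pt", "es", 0.30), ("pt", "es", 0.34),
    ("pt", "ko", 0.10), ("ko", "pt", 0.14),
    ("es", "ko", 0.12), ("es", "ko", 0.08),
]

# Hypothetical language similarity per unordered pair (also invented).
similarity = {("es", "pt"): 0.8, ("ko", "pt"): 0.2, ("es", "ko"): 0.1}

pair_means = mean_score_per_pair(trials)
pairs = sorted(similarity)
r = pearson([similarity[p] for p in pairs], [pair_means[p] for p in pairs])
print(f"Pearson r between similarity and mean non-target score: {r:.3f}")
```

With real data, a clearly positive `r` for pairs of unseen languages and a near-flat relationship for pairs involving a training language would correspond to the two patterns the abstract reports.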

Taxonomy

Topics: Speech Recognition and Synthesis

Methods: Sparse Evolutionary Training