Tackling the Score Shift in Cross-Lingual Speaker Verification by Exploiting Language Information

Jenthe Thienpondt; Brecht Desplanques; Kris Demuynck

arXiv:2110.09150 · eess.AS · June 22, 2022

TL;DR

This paper analyzes why cross-lingual trials are difficult for speaker verification and proposes two techniques to improve robustness: training on more intra-speaker cross-lingual samples and incorporating language information into score calibration. Together they give an 11.7% relative improvement on the VoxSRC-21 test set.

Contribution

The paper introduces two methods to address the score shift in cross-lingual speaker verification: a mini-batch sampling strategy for the Large-Margin Fine-Tuning (LM-FT) training stage, and a language-aware logistic regression calibration stage.
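The sampling idea can be illustrated with a minimal sketch. The summary does not spell out the authors' exact sampling rule, so everything below is an assumption made for illustration: the `(speaker, language, utterance_id)` tuple layout, the helper names `build_index` and `sample_batch`, and the rule itself (draw same-speaker pairs, and force the two utterances to come from different languages whenever the speaker has more than one).

```python
import random
from collections import defaultdict


def build_index(utterances):
    """Group utterance ids as index[speaker][language] -> list of ids.

    `utterances` is an iterable of (speaker, language, utterance_id) tuples;
    this layout is a hypothetical choice for the sketch.
    """
    index = defaultdict(lambda: defaultdict(list))
    for speaker, language, utt_id in utterances:
        index[speaker][language].append(utt_id)
    return index


def sample_batch(index, num_pairs, rng):
    """Draw `num_pairs` same-speaker pairs, cross-lingual whenever possible.

    Assumed rule (not taken from the paper): a speaker with two or more
    languages always contributes a pair spanning two different languages;
    a monolingual speaker contributes an ordinary same-language pair,
    which may repeat an utterance if the speaker has only one.
    """
    speakers = sorted(index)
    batch = []
    for _ in range(num_pairs):
        speaker = rng.choice(speakers)
        languages = sorted(index[speaker])
        if len(languages) >= 2:
            lang_a, lang_b = rng.sample(languages, 2)  # two distinct languages
        else:
            lang_a = lang_b = languages[0]
        utt_a = rng.choice(index[speaker][lang_a])
        utt_b = rng.choice(index[speaker][lang_b])
        batch.append((speaker, utt_a, utt_b))
    return batch


# Toy metadata, invented for the example.
toy_utterances = [
    ("spk1", "en", "spk1_en_0"),
    ("spk1", "nl", "spk1_nl_0"),
    ("spk2", "en", "spk2_en_0"),
    ("spk2", "en", "spk2_en_1"),
]
toy_batch = sample_batch(build_index(toy_utterances), 4, random.Random(0))
```

In a real training loop the language labels would have to come from metadata or a language identification model, and the pairs would be flattened into the mini-batch fed to the fine-tuning loss; neither step is shown here.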

Findings

01. 11.7% relative performance improvement on the VoxSRC-21 test set

02. The modified mini-batch sampling increases the share of intra-speaker cross-lingual samples seen during training

03. Incorporating language information into the calibration stage improves calibration accuracy
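Language-aware calibration can be sketched as a logistic regression that sees a language feature next to the raw verification score. This is an illustrative sketch, not the authors' implementation: the single binary `cross_lingual` feature, the function names, the plain gradient-descent fit, and all toy numbers are assumptions. The toy scores are constructed so that cross-lingual trials sit 0.2 lower than same-language trials, mimicking the underestimated similarity described in the abstract.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fit_calibration(trials, epochs=2000, lr=0.5):
    """Fit log-odds = w_s * score + w_q * cross_lingual + b by gradient descent.

    `trials` is a list of (score, cross_lingual, is_target) with the last
    two entries in {0, 1}. The feature set is a hypothetical minimal choice.
    """
    w_s = w_q = b = 0.0
    n = len(trials)
    for _ in range(epochs):
        g_s = g_q = g_b = 0.0
        for score, cross, target in trials:
            err = sigmoid(w_s * score + w_q * cross + b) - target
            g_s += err * score
            g_q += err * cross
            g_b += err
        w_s -= lr * g_s / n
        w_q -= lr * g_q / n
        b -= lr * g_b / n
    return w_s, w_q, b


def calibrate(score, cross, params):
    """Map a raw score and the language feature to calibrated log-odds."""
    w_s, w_q, b = params
    return w_s * score + w_q * cross + b


# Toy trials, invented for the example: (score, cross_lingual, is_target).
toy_trials = (
    [(s, 0, 1) for s in (0.6, 0.7, 0.8)]      # same-language targets
    + [(s, 0, 0) for s in (0.1, 0.2, 0.3)]    # same-language non-targets
    + [(s, 1, 1) for s in (0.4, 0.5, 0.6)]    # cross-lingual targets, shifted down
    + [(s, 1, 0) for s in (-0.1, 0.0, 0.1)]   # cross-lingual non-targets
)
toy_params = fit_calibration(toy_trials)
```

The weight on the language feature lets the calibration apply a different effective threshold to cross-lingual trials instead of forcing one threshold onto both conditions. A score-only calibration cannot do this, which is the motivation for adding language information at this stage.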

Abstract

This paper contains a post-challenge performance analysis on cross-lingual speaker verification of the IDLab submission to the VoxCeleb Speaker Recognition Challenge 2021 (VoxSRC-21). We show that current speaker embedding extractors consistently underestimate speaker similarity in within-speaker cross-lingual trials. Consequently, the typical training and scoring protocols do not put enough emphasis on the compensation of intra-speaker language variability. We propose two techniques to increase cross-lingual speaker verification robustness. First, we enhance our previously proposed Large-Margin Fine-Tuning (LM-FT) training stage with a mini-batch sampling strategy which increases the amount of intra-speaker cross-lingual samples within the mini-batch. Second, we incorporate language information in the logistic regression calibration stage. We integrate quality metrics based on soft and…

Taxonomy

Methods: Test · Logistic Regression