The IDLAB VoxSRC-20 Submission: Large Margin Fine-Tuning and Quality-Aware Score Calibration in DNN Based Speaker Verification
Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck

TL;DR
This paper introduces large margin fine-tuning and quality-aware score calibration for DNN-based speaker verification. Together they make speaker embeddings more robust and decision thresholds more consistent across trial conditions, giving state-of-the-art results on the VoxCeleb1 test sets and contributing to the winning VoxCeleb Speaker Recognition Challenge 2020 submission.
Contribution
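The paper presents two enhancements: a large margin fine-tuning stage, which retrains a margin-based speaker verification network on longer utterances with a more aggressive margin penalty, and a quality-aware score calibration method that adds quality features to the calibration system.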
Findings
Achieved state-of-the-art results on VoxCeleb1 test sets.
Fine-tuning on longer training utterances with a more aggressive margin penalty produced more robust speaker embeddings.
Contributed to winning the VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20).
Abstract
In this paper we propose and analyse a large margin fine-tuning strategy and a quality-aware score calibration in text-independent speaker verification. Large margin fine-tuning is a secondary training stage for DNN based speaker verification systems trained with margin-based loss functions. It enables the network to create more robust speaker embeddings by enabling the use of longer training utterances in combination with a more aggressive margin penalty. Score calibration is a common practice in speaker verification systems to map output scores to well-calibrated log-likelihood-ratios, which can be converted to interpretable probabilities. By including quality features in the calibration system, the decision thresholds of the evaluation metrics become quality-dependent and more consistent across varying trial conditions. Applying both enhancements on the ECAPA-TDNN architecture leads…
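The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an additive angular margin softmax as the margin-based loss and a linear (logistic-regression style) calibration that adds weighted quality features to the raw score. All function names, weights, margin values, and the scale factor below are hypothetical and chosen only for illustration.

```python
import math


def aam_softmax_loss(cosines, target, margin, scale):
    """Additive angular margin softmax loss for a single sample.

    cosines: cosine similarity between the embedding and each class weight.
    target:  index of the true speaker class.
    margin:  angular penalty (radians) added to the target class angle.
    scale:   factor applied to all cosines before the softmax.
    """
    logits = []
    for i, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding errors
        if i == target:
            theta = math.acos(c)
            # Penalise the target class; clamp the angle at pi (a simplification).
            c = math.cos(min(theta + margin, math.pi))
        logits.append(scale * c)
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[target]


def calibrated_llr(score, quality, w_score, w_quality, bias):
    """Map a raw score plus quality features to a log-likelihood ratio.

    Assumed linear form: w_score * score + sum(w_q * q) + bias.
    """
    return w_score * score + sum(w * q for w, q in zip(w_quality, quality)) + bias


def llr_to_posterior(llr, prior_target):
    """Convert a log-likelihood ratio to a same-speaker posterior probability."""
    log_odds = llr + math.log(prior_target / (1.0 - prior_target))
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    cosines = [0.8, 0.1, -0.2]  # hypothetical similarities to three speakers
    # A larger margin gives a larger loss for the same embedding,
    # which is the sense in which the penalty is more aggressive.
    print(aam_softmax_loss(cosines, target=0, margin=0.2, scale=30.0))
    print(aam_softmax_loss(cosines, target=0, margin=0.5, scale=30.0))
    # The same raw score maps to different LLRs under different quality values.
    print(calibrated_llr(0.5, [2.0], w_score=10.0, w_quality=[1.0], bias=-4.0))
    print(calibrated_llr(0.5, [0.5], w_score=10.0, w_quality=[1.0], bias=-4.0))
```

Because the quality term shifts the log-likelihood ratio, a fixed threshold on the calibrated output corresponds to a quality-dependent threshold on the raw score, which is the behaviour the abstract describes.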
