The IDLAB VoxSRC-20 Submission: Large Margin Fine-Tuning and Quality-Aware Score Calibration in DNN Based Speaker Verification

Jenthe Thienpondt; Brecht Desplanques; Kris Demuynck

arXiv:2010.11255·cs.SD·June 29, 2021

TL;DR

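This paper introduces two enhancements for DNN-based speaker verification: large margin fine-tuning, a secondary training stage that combines longer training utterances with a more aggressive margin penalty, and quality-aware score calibration, which makes decision thresholds more consistent across varying trial conditions. Applied to the ECAPA-TDNN architecture, they yield state-of-the-art results on the VoxCeleb1 test sets and contributed to a winning VoxSRC-20 submission.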

Contribution

The paper presents a large margin fine-tuning strategy and a quality-aware score calibration method that improve the performance and robustness of speaker verification systems.

Findings

01

Achieved state-of-the-art results on VoxCeleb1 test sets.

02

Produced more robust speaker embeddings by fine-tuning with longer training utterances and a more aggressive margin penalty.

03

Contributed to winning VoxCeleb Speaker Recognition Challenge 2020.

Abstract

In this paper we propose and analyse a large margin fine-tuning strategy and a quality-aware score calibration in text-independent speaker verification. Large margin fine-tuning is a secondary training stage for DNN based speaker verification systems trained with margin-based loss functions. It enables the network to create more robust speaker embeddings by enabling the use of longer training utterances in combination with a more aggressive margin penalty. Score calibration is a common practice in speaker verification systems to map output scores to well-calibrated log-likelihood-ratios, which can be converted to interpretable probabilities. By including quality features in the calibration system, the decision thresholds of the evaluation metrics become quality-dependent and more consistent across varying trial conditions. Applying both enhancements on the ECAPA-TDNN architecture leads…
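The calibration idea in the abstract can be illustrated with a minimal sketch. It assumes the calibrated log-likelihood ratio is an affine function of the raw score and a single quality feature, fitted by logistic regression on balanced trials. The synthetic data, the `make_trials` helper, the single `quality` feature, and all parameter values are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np


def llr_to_posterior(llr, target_prior=0.5):
    """Turn a log-likelihood ratio into a target posterior via Bayes' rule."""
    prior_log_odds = np.log(target_prior / (1.0 - target_prior))
    return 1.0 / (1.0 + np.exp(-(np.asarray(llr, dtype=float) + prior_log_odds)))


def fit_calibration(features, labels, lr=0.5, steps=2000):
    """Fit llr = features @ w + b by gradient descent on the mean logistic loss.

    Assumption: training trials are balanced between targets and non-targets,
    so the fitted log-odds can be read as a log-likelihood ratio.
    """
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))
        err = p - labels
        w -= lr * (features.T @ err) / len(labels)
        b -= lr * err.mean()
    return w, b


def make_trials(n_per_class, rng):
    """Hypothetical synthetic trials in which raw scores drift with quality."""
    quality = rng.uniform(0.0, 1.0, size=2 * n_per_class)
    labels = np.concatenate([np.ones(n_per_class), np.zeros(n_per_class)])
    noise = rng.normal(0.0, 0.1, size=2 * n_per_class)
    scores = 0.5 * labels + 0.3 * quality + noise
    return np.column_stack([scores, quality]), labels


rng = np.random.default_rng(0)
features, labels = make_trials(1000, rng)

# Column 0 is the raw score, column 1 is the (hypothetical) quality feature.
w, b = fit_calibration(features, labels)
llr = features @ w + b

print("score weight:", w[0], "quality weight:", w[1], "offset:", b)
print("posterior at LLR = 0 with target prior 0.1:", llr_to_posterior(0.0, 0.1))
```

Because the quality feature enters the affine map alongside the raw score, the raw-score value at which the calibrated LLR crosses a fixed threshold shifts with quality. This is one way to read the abstract's statement that decision thresholds become quality-dependent.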
