Exploring Binary Classification Loss For Speaker Verification

Bing Han; Zhengyang Chen; Yanmin Qian

arXiv:2307.08205 · eess.AS · July 18, 2023

TL;DR

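This paper introduces the SphereFace2 framework for speaker verification, which trains the speaker model with binary classifiers rather than a single multi-class classifier. It narrows the gap between training and evaluation, improves performance (especially on hard trials), and is more robust to noisy labels.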

Contribution

The paper proposes a pair-wise training paradigm for speaker verification built on binary classifiers. It outperforms existing loss functions and is more robust to label noise.

Findings

01. SphereFace2 outperforms existing loss functions on VoxCeleb.

02. Large margin fine-tuning further improves SphereFace2.

03. SphereFace2 demonstrates robustness to noisy labels.

Abstract

The mismatch between closed-set training and open-set testing usually leads to significant performance degradation in speaker verification. Among existing loss functions, metric learning-based objectives depend strongly on finding effective pairs, which may hinder further improvement, while popular multi-class classification methods often degrade when evaluated on unseen speakers. In this work, we introduce the SphereFace2 framework, which trains the speaker model in a pair-wise manner with several binary classifiers instead of performing multi-class classification. This learning paradigm effectively narrows the gap between training and evaluation. Experiments on VoxCeleb show that SphereFace2 outperforms other existing loss functions, especially on hard trials. In addition, the large margin fine-tuning strategy is proven to be compatible…
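The idea of replacing one multi-class classifier with several binary classifiers can be illustrated with a minimal sketch. It assumes one binary (one-vs-rest) logistic loss per training speaker, computed on the cosine similarity between an utterance embedding and that speaker's weight vector. The function names, the hyperparameters (`scale`, `margin`, `bias`, `pos_weight`) and their default values are illustrative assumptions, not the paper's exact formulation or the official repository's code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def binary_pairwise_loss(embedding, speaker_weights, target,
                         scale=30.0, margin=0.2, bias=0.0, pos_weight=0.7):
    """Sum of one-vs-rest binary logistic losses for one utterance.

    All hyperparameters are assumed values chosen for illustration.
    Each speaker k has its own binary classifier: "is this utterance
    from speaker k or not?". No softmax couples the classifiers.
    """
    loss = 0.0
    for k, weight in enumerate(speaker_weights):
        cos_k = cosine(embedding, weight)
        if k == target:
            # Positive pair: push the similarity above the margin.
            loss += pos_weight * softplus(-(scale * (cos_k - margin) + bias))
        else:
            # Negative pair: push the similarity below minus the margin.
            loss += (1.0 - pos_weight) * softplus(scale * (cos_k + margin) + bias)
    return loss / scale


# Toy example: three speakers with 2-D weight vectors (hypothetical data).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss_correct = binary_pairwise_loss([1.0, 0.0], weights, target=0)
loss_wrong = binary_pairwise_loss([1.0, 0.0], weights, target=1)
```

In this toy setup the embedding points at speaker 0, so `loss_correct` comes out smaller than `loss_wrong`. Because each term scores a single embedding-speaker pair as same or different, the training objective resembles the accept/reject decision made on a verification trial, which is the intuition behind the reduced training-evaluation gap described in the abstract.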

Code & Models

Repositories

hunterhuan/sphereface2_speaker_verification (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing