Pushing the Frontiers of Self-Distillation Prototypes Network with Dimension Regularization and Score Normalization

Yafeng Chen; Chong Deng; Hui Wang; Yiheng Jiang; Han Yin; Qian Chen; Wen Wang

arXiv:2505.13826 · eess.AS · May 21, 2025

TL;DR

This paper improves self-supervised speaker verification by introducing dimension regularization and score normalization to the SDPN framework, significantly narrowing the performance gap with supervised methods.

Contribution

It proposes a novel combination of dimension regularization and score normalization within SDPN, achieving state-of-the-art results in self-supervised speaker verification.

Findings

01 Achieved state-of-the-art EER on the VoxCeleb1 benchmark

02 Improved self-supervised SV performance by over 20%

03 Effectively addressed embedding collapse with regularization

Abstract

Developing robust speaker verification (SV) systems without speaker labels has been a longstanding challenge. Earlier research has highlighted a considerable performance gap between self-supervised and fully supervised approaches. In this paper, we enhance the non-contrastive self-supervised framework Self-Distillation Prototypes Network (SDPN) by introducing dimension regularization, which explicitly addresses the collapse problem by applying regularization terms to speaker embeddings. Moreover, we integrate score normalization techniques from fully supervised SV to further narrow the gap to supervised verification performance. SDPN with dimension regularization and score normalization sets a new state of the art on the VoxCeleb1 speaker verification evaluation benchmark, achieving Equal Error Rates of 1.29%, 1.60%, and 2.80% on the VoxCeleb1-{O,E,H} trials, respectively.…
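
The abstract does not say which score normalization variant is used, so the following is only a minimal sketch of one common choice from supervised SV, symmetric normalization (S-norm), and not the authors' implementation. It assumes cosine scoring and a cohort of embeddings from speakers outside the trial. The function names `cosine_scores` and `s_norm`, the embedding size, and the cohort size are all hypothetical, and the data is random.

```python
import numpy as np


def cosine_scores(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T


def s_norm(enroll, test, cohort):
    """Symmetric score normalization of one trial (illustrative sketch).

    enroll, test: 1-D embeddings of the two utterances in the trial.
    cohort: 2-D array with one embedding per cohort utterance.
    Assumes the cohort scores are not all identical (non-zero std).
    """
    raw = float(cosine_scores(enroll[None, :], test[None, :])[0, 0])
    # Score each side of the trial against the whole cohort.
    enroll_vs_cohort = cosine_scores(enroll[None, :], cohort)[0]
    test_vs_cohort = cosine_scores(test[None, :], cohort)[0]
    # Standardize the raw score with each side's cohort statistics.
    z = (raw - enroll_vs_cohort.mean()) / enroll_vs_cohort.std()
    t = (raw - test_vs_cohort.mean()) / test_vs_cohort.std()
    # Average the two standardized scores so the result is symmetric.
    return 0.5 * (z + t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cohort = rng.normal(size=(200, 64))   # hypothetical cohort embeddings
    enroll = rng.normal(size=64)          # hypothetical enrollment embedding
    same = enroll + 0.1 * rng.normal(size=64)   # slightly perturbed copy
    other = rng.normal(size=64)                 # unrelated embedding
    print("same-speaker-like trial:", s_norm(enroll, same, cohort))
    print("different-speaker-like trial:", s_norm(enroll, other, cohort))
```

Normalizing against a cohort puts scores from different trials on a comparable scale, which is why a single decision threshold tends to work better after normalization. Adaptive variants that keep only the top-scoring cohort entries per trial follow the same pattern.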

Code & Models

Repositories

modelscope/3D-Speaker (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications