Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition
Zhiyuan Peng, Xuanji He, Ke Ding, Tan Lee, Guanglu Wan

TL;DR
This paper introduces a label-free knowledge distillation method that uses a contrastive loss to improve lightweight speaker recognition models, outperforming conventional distillation methods without relying on speaker labels.
Contribution
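It proposes label-free knowledge distillation for lightweight speaker recognition models based on the contrastive loss from self-supervised learning. This addresses the inefficiency of conventional distillation losses, which is why existing approaches typically need speaker labels.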
Findings
Outperforms conventional distillation methods
Significantly improves lightweight SR performance
Validated on diverse public speech datasets
Abstract
Very deep models for speaker recognition (SR) have demonstrated remarkable performance improvement in recent research. However, it is impractical to deploy these models for on-device applications with constrained computational resources. On the other hand, light-weight models are highly desired in practice despite their sub-optimal performance. This research aims to improve light-weight SR models through large-scale label-free knowledge distillation (KD). Existing KD approaches for SR typically require speaker labels to learn task-specific knowledge, due to the inefficiency of conventional loss for distillation. To address the inefficiency problem and achieve label-free KD, we propose to employ the contrastive loss from self-supervised learning for distillation. Extensive experiments are conducted on a collection of public speech datasets from diverse sources. Results on light-weight SR…
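The abstract does not spell out the loss, so the following is only a minimal sketch of what a contrastive distillation objective could look like, assuming an InfoNCE-style formulation: for each utterance, the teacher embedding of the same utterance is the positive, and the teacher embeddings of the other utterances in the batch are negatives. The function name `contrastive_distillation_loss`, the `temperature` value, and the assumption that student and teacher embeddings share a dimension are all illustrative choices, not details taken from the paper. No speaker labels appear anywhere in the computation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrastive_distillation_loss(student, teacher, temperature=0.1):
    """InfoNCE-style distillation loss over a batch (hypothetical formulation).

    student[i] and teacher[i] are embeddings of the same utterance.
    teacher[j] with j != i serves as a negative for student[i].
    """
    s = [l2_normalize(v) for v in student]
    t = [l2_normalize(v) for v in teacher]
    total = 0.0
    for i, s_i in enumerate(s):
        # Similarity of this student embedding to every teacher embedding.
        logits = [dot(s_i, t_j) / temperature for t_j in t]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matching teacher embedding as the target.
        total += log_denom - logits[i]
    return total / len(s)


# Toy batch of three utterances with 2-dimensional embeddings.
teacher = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned_student = [[2.0, 0.0], [0.0, 3.0], [-0.5, 0.0]]   # same directions as teacher
shuffled_student = [teacher[1], teacher[2], teacher[0]]   # mismatched pairs

loss_aligned = contrastive_distillation_loss(aligned_student, teacher)
loss_shuffled = contrastive_distillation_loss(shuffled_student, teacher)
```

In this toy batch the aligned student yields a lower loss than the shuffled one, which is the behavior such an objective is meant to reward. In practice the student and teacher embedding sizes often differ, in which case a learned projection would be needed before comparing them; that step is omitted here.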
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
