Multi-task Metric Learning for Text-independent Speaker Verification

Yafeng Chen; Wu Guo; Jingjing Shi; Jiajun Qi; Tan Liu

arXiv:2010.10919·eess.AS·March 24, 2023

Open Access

TL;DR

This paper introduces a multi-task metric learning approach for text-independent speaker verification: a deep speaker embedding network is trained with cross-entropy loss plus an auxiliary pair-based similarity loss. The authors report that the approach is effective on the SITW dataset.

Contribution

The paper proposes a novel multi-task learning framework integrating metric learning with deep embedding training for speaker verification.

Findings

01 Improved speaker verification accuracy on the SITW dataset

02 Effective combination of cross-entropy and metric learning losses

03 Enhanced discriminative power of speaker embeddings

Abstract

In this work, we introduce metric learning (ML) to enhance deep embedding learning for text-independent speaker verification (SV). Specifically, the deep speaker embedding network is trained with a conventional cross-entropy loss and an auxiliary pair-based ML loss. For the auxiliary ML task, the training samples of a mini-batch are first arranged into pairs; positive and negative pairs are then selected and weighted according to their own and relative similarities; finally, the auxiliary ML loss is calculated from the similarities of the selected pairs. To evaluate the proposed method, we conduct experiments on the Speakers in the Wild (SITW) dataset. The results demonstrate the effectiveness of the proposed method.
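
The abstract does not give the loss formulas, so the following is only a minimal sketch of the three steps it names (pair construction, selection by relative similarity, similarity-based weighting) added to a cross-entropy term. It assumes a multi-similarity-style pair loss as a stand-in and is not the authors' exact method. The function names and the hyperparameters alpha, beta, lam, margin, and weight are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (log-sum-exp for stability)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]

def pair_loss(embeddings, labels, alpha=2.0, beta=50.0, lam=0.5, margin=0.1):
    """Assumed multi-similarity-style loss over all pairs in a mini-batch."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        # Step 1: arrange the batch into pairs anchored at sample i.
        pos = [cosine(embeddings[i], embeddings[j])
               for j in range(n) if j != i and labels[j] == labels[i]]
        neg = [cosine(embeddings[i], embeddings[j])
               for j in range(n) if labels[j] != labels[i]]
        if not pos or not neg:
            continue
        # Step 2: select pairs by relative similarity. Keep positives that
        # are not clearly above the hardest negative, and negatives that
        # are not clearly below the hardest positive.
        hard_pos = [s for s in pos if s - margin < max(neg)]
        hard_neg = [s for s in neg if s + margin > min(pos)]
        # Step 3: the soft exponential terms weight each selected pair
        # by its own similarity.
        if hard_pos:
            total += math.log(1.0 + sum(math.exp(-alpha * (s - lam))
                                        for s in hard_pos)) / alpha
        if hard_neg:
            total += math.log(1.0 + sum(math.exp(beta * (s - lam))
                                        for s in hard_neg)) / beta
    return total / n

def multi_task_loss(logits_batch, embeddings, labels, weight=1.0):
    """Mean cross-entropy plus a weighted auxiliary pair loss (assumed form)."""
    ce = sum(cross_entropy(l, y) for l, y in zip(logits_batch, labels))
    ce /= len(labels)
    return ce + weight * pair_loss(embeddings, labels)

# Toy batch: two hypothetical speakers, two utterances each.
toy_embeddings = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
toy_logits = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
toy_labels = [0, 0, 1, 1]
print(multi_task_loss(toy_logits, toy_embeddings, toy_labels))
```

In this sketch a batch whose same-speaker embeddings are already well separated from other speakers contributes no pair loss, because no pairs survive the selection step; only the cross-entropy term remains. In practice the same computation would be written with a tensor library so that gradients flow back to the embedding network.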

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing