Angular Softmax Loss for End-to-end Speaker Verification
Yutian Li, Feng Gao, Zhijian Ou, Jiasong Sun

TL;DR
This paper introduces the use of angular softmax loss in end-to-end speaker verification, significantly improving discriminative feature learning and system performance, especially under short utterance conditions.
Contribution
The paper pioneers applying A-softmax loss to speaker verification and demonstrates its effectiveness in enhancing discriminative features and overall system accuracy.
Findings
A-softmax loss reduces the equal error rate (EER) in speaker verification.
Combining A-softmax with PLDA scoring improves performance on short utterances.
Experiments on the Fisher dataset show significant accuracy gains.
Abstract
End-to-end speaker verification systems have received increasing interest. The traditional i-vector approach trains a generative model (basically a factor-analysis model) to extract i-vectors as speaker embeddings. In contrast, the end-to-end approach directly trains a discriminative model (often a neural network) to learn discriminative speaker embeddings; a crucial component is the training criterion. In this paper, we use angular softmax (A-softmax), which was originally proposed for face verification, as the loss function for feature learning in end-to-end speaker verification. By introducing margins between classes into softmax loss, A-softmax can learn more discriminative features than softmax loss and triplet loss, and at the same time, is easy and stable to use. We make two contributions in this work. 1) We introduce A-softmax loss into end-to-end speaker verification and…
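To make the margin idea in the abstract concrete, here is a minimal NumPy sketch of an A-softmax loss. It follows the formulation commonly given for face verification, where the target-class logit uses psi(theta) = (-1)^k cos(m*theta) - 2k for theta in [k*pi/m, (k+1)*pi/m], class weights are unit-normalized, and biases are dropped. The function name `a_softmax_loss`, the argument shapes, and the default margin `m=4` are assumptions for illustration; the excerpt above does not specify the authors' exact configuration or network.

```python
import numpy as np

def a_softmax_loss(x, W, labels, m=4):
    """Illustrative A-softmax loss (an assumed sketch, not the authors' code).

    x: (N, D) speaker embeddings (assumed non-zero), W: (D, C) class weights,
    labels: (N,) integer speaker ids, m: integer angular margin.
    """
    # Unit-normalize each class weight column so logits depend only on
    # the embedding norm and the angle to the class direction.
    W_unit = W / np.linalg.norm(W, axis=0, keepdims=True)
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)        # (N, 1)
    cos = np.clip((x @ W_unit) / x_norm, -1.0, 1.0)          # (N, C)

    idx = np.arange(x.shape[0])
    theta = np.arccos(cos[idx, labels])                      # angle to own class

    # psi(theta) is a monotonically decreasing extension of cos(m * theta).
    k = np.minimum(np.floor(theta * m / np.pi), m - 1)
    sign = np.where(k % 2 == 0, 1.0, -1.0)
    psi = sign * np.cos(m * theta) - 2.0 * k

    # Non-target logits keep cos(theta_j); the target logit uses psi(theta).
    logits = x_norm * cos
    logits[idx, labels] = x_norm[:, 0] * psi

    # Numerically stable log-softmax followed by mean cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[idx, labels].mean()

# Toy usage with random data (hypothetical sizes, not from the paper).
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 5))
weights = rng.normal(size=(5, 3))
spk = rng.integers(0, 3, size=8)
print(a_softmax_loss(emb, weights, spk, m=1), a_softmax_loss(emb, weights, spk, m=4))
```

With `m=1` the function reduces to softmax with normalized weights; larger `m` lowers the target-class logit for the same angle, so the network has to pull each embedding closer to its own class direction to reach the same loss.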
