Real Additive Margin Softmax for Speaker Verification

Lantian Li; Ruiqian Nai; Dong Wang

arXiv:2110.09116 · cs.SD · October 19, 2021 · 5 cites

TL;DR

This paper analyzes the additive margin softmax loss in speaker verification, revealing it does not implement true max-margin training, and proposes a corrected version that improves performance on multiple datasets.

Contribution

The paper introduces a true margin function into AM-Softmax, providing a more accurate max-margin training approach for speaker verification.

Findings

01 The corrected AM-Softmax outperforms the original on VoxCeleb1, SITW, and CNCeleb

02 Analysis shows that the original AM-Softmax does not implement real max-margin training

03 The proposed method improves speaker verification accuracy

Abstract

The additive margin softmax (AM-Softmax) loss has delivered remarkable performance in speaker verification. A supposed behavior of AM-Softmax is that it shrinks within-class variation by putting emphasis on target logits, which in turn improves the margin between target and non-target classes. In this paper, we conduct a careful analysis of the behavior of the AM-Softmax loss and show that it does not implement real max-margin training. Based on this observation, we present a Real AM-Softmax loss, which involves a true margin function in the softmax training. Experiments conducted on VoxCeleb1, SITW and CNCeleb demonstrate that the corrected AM-Softmax loss consistently outperforms the original one. The code has been released at https://gitlab.com/csltstu/sunine.
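To make the abstract's claim concrete, here is a minimal pure-Python sketch for a single training sample. The first function is the standard AM-Softmax loss, rewritten in the equivalent form log(1 + Σ exp(s·(cos θ_j − cos θ_y + m))) over non-target classes j. The second is a hypothetical hinge variant that clips each term at zero, which is one way a "true margin function" could look; the abstract does not state the exact formula, so this is an assumption and not the authors' released implementation. The function names and the values `s=30.0` and `m=0.2` are illustrative.

```python
import math


def am_softmax_loss(cosines, target, s=30.0, m=0.2):
    """Standard AM-Softmax loss for one sample.

    cosines: cosine similarity between the embedding and each class centre.
    target:  index of the true class.
    s, m:    scale and additive margin (illustrative defaults).
    """
    cos_y = cosines[target]
    # Equivalent to -log softmax with the margin subtracted from the target logit.
    total = sum(
        math.exp(s * (c - cos_y + m))
        for j, c in enumerate(cosines)
        if j != target
    )
    return math.log1p(total)


def hinged_am_softmax_loss(cosines, target, s=30.0, m=0.2):
    """Hypothetical hinge variant (an assumption, not taken from the paper).

    Each non-target term is clipped at zero, so a class that already
    satisfies cos_y - cos_j >= m contributes a constant and no gradient.
    """
    cos_y = cosines[target]
    total = sum(
        math.exp(s * max(0.0, c - cos_y + m))
        for j, c in enumerate(cosines)
        if j != target
    )
    return math.log1p(total)
```

Under these assumptions, raising the target cosine of an already well-separated sample keeps lowering the standard loss, while the hinged loss stays flat once every non-target class clears the margin. That difference is the kind of behavior the abstract's distinction between AM-Softmax and real max-margin training refers to.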

Code & Models

Repositories

https://gitlab.com/csltstu/sunine
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Natural Language Processing Techniques

Methods: Softmax