Real Additive Margin Softmax for Speaker Verification
Lantian Li, Ruiqian Nai, Dong Wang

TL;DR
This paper analyzes the additive margin softmax (AM-Softmax) loss used in speaker verification, shows that it does not implement true max-margin training, and proposes a corrected version that outperforms the original on VoxCeleb1, SITW, and CNCeleb.
Contribution
The paper introduces a true margin function into AM-Softmax, providing a more accurate max-margin training approach for speaker verification.
Findings
The corrected (Real) AM-Softmax consistently outperforms the original on VoxCeleb1, SITW, and CNCeleb
Analysis shows that the original AM-Softmax does not implement real max-margin training
Adding a true margin function to the softmax training improves speaker verification accuracy
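One way to see why the original loss might fall short of max-margin training is to rewrite it algebraically. The sketch below uses the commonly cited AM-Softmax definition; the symbols are assumptions borrowed from common usage, since this summary does not define them: s is a scale factor, m the additive margin, y the target class, and θ_j the angle between the embedding and the weight vector of class j.

```latex
\begin{aligned}
\mathcal{L}_{\text{AM}}
  &= -\log \frac{e^{s(\cos\theta_y - m)}}
                {e^{s(\cos\theta_y - m)} + \sum_{j \neq y} e^{s\cos\theta_j}} \\
  &= \log\Bigl(1 + \sum_{j \neq y} e^{\,s(\cos\theta_j - \cos\theta_y + m)}\Bigr)
\end{aligned}
```

Every term in the sum is strictly positive, so the loss keeps decreasing as the gap between target and non-target cosines grows, even after that gap already exceeds m. In this form m acts as a constant shift rather than a threshold, whereas a hinge-style margin loss stops penalizing a sample once it clears the margin.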
Abstract
The additive margin softmax (AM-Softmax) loss has delivered remarkable performance in speaker verification. A supposed behavior of AM-Softmax is that it can shrink within-class variation by putting emphasis on target logits, which in turn improves the margin between target and non-target classes. In this paper, we conduct a careful analysis of the behavior of the AM-Softmax loss, and show that this loss does not implement real max-margin training. Based on this observation, we present a Real AM-Softmax loss which involves a true margin function in the softmax training. Experiments conducted on VoxCeleb1, SITW and CNCeleb demonstrate that the corrected AM-Softmax loss consistently outperforms the original one. The code has been released at https://gitlab.com/csltstu/sunine.
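The difference between a shifted softmax and a true margin can be illustrated numerically. The following is a minimal sketch, not the released implementation: `am_softmax_loss` follows the commonly cited AM-Softmax definition, while `hinged_am_softmax_loss` is an assumed illustration of a "true margin function" that clips each non-target term with a hinge. The function names, the default values s=30 and m=0.2, and the hinge form itself are assumptions; the abstract does not spell out the exact formula.

```python
import numpy as np


def am_softmax_loss(cos, target, s=30.0, m=0.2):
    """Standard AM-Softmax for one sample.

    cos    : cosine similarities between the embedding and each class weight
    target : index of the true class
    """
    cos = np.asarray(cos, dtype=float)
    logits = s * cos                       # new array, input is not modified
    logits[target] = s * (cos[target] - m)  # subtract the margin on the target
    logits -= logits.max()                  # shift for numerical stability
    return -(logits[target] - np.log(np.exp(logits).sum()))


def hinged_am_softmax_loss(cos, target, s=30.0, m=0.2):
    """Assumed hinge variant (illustrative only).

    Each non-target term uses max(0, s * (cos_j - cos_y + m)), so a class
    pair that already satisfies the margin contributes a constant.
    """
    cos = np.asarray(cos, dtype=float)
    gaps = s * (np.delete(cos, target) - cos[target] + m)
    return np.log1p(np.exp(np.maximum(0.0, gaps)).sum())


if __name__ == "__main__":
    easy = [0.90, 0.10, 0.00]    # margin satisfied for both non-target classes
    easier = [0.95, 0.10, 0.00]  # same sample, pushed even further
    for name, fn in [("AM-Softmax", am_softmax_loss),
                     ("hinged", hinged_am_softmax_loss)]:
        print(name, fn(easy, 0), fn(easier, 0))
```

Under these assumptions, the AM-Softmax loss still changes when an already well-separated sample is pushed further, while the hinged variant stays constant once every non-target class clears the margin, so such samples stop contributing gradient.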
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Natural Language Processing Techniques
Methods: Softmax
