Additive Phoneme-aware Margin Softmax Loss for Language Recognition

Zheng Li; Yan Liu; Lin Li; Qingyang Hong

arXiv:2106.12851 · cs.SD · June 25, 2021

TL;DR

This paper introduces an additive phoneme-aware margin softmax loss that dynamically adjusts margins based on phonetic information, improving language recognition accuracy over traditional fixed-margin methods.

Contribution

The paper presents a novel APM-Softmax loss that automatically tunes margins for each sample using phonetic recognition results, enhancing multi-task learning for language recognition.

Findings

01. Improved performance on Oriental Language Recognition datasets.

02. Outperforms traditional AM-Softmax and AAM-Softmax losses.

03. Demonstrates effectiveness across various testing conditions.

Abstract

This paper proposes an additive phoneme-aware margin softmax (APM-Softmax) loss to train the multi-task learning network with phonetic information for language recognition. In additive margin softmax (AM-Softmax) loss, the margin is set as a constant during the entire training for all training samples, which is suboptimal because recognition difficulty varies across training samples. In additive angular margin softmax (AAM-Softmax) loss, the additional angular margin is set as a constant as well. In this paper, we propose an APM-Softmax loss for language recognition with phonetic multi-task learning, in which the additive phoneme-aware margin is automatically tuned for different training samples. More specifically, the margin of language recognition is adjusted according to the results of phoneme recognition. Experiments are reported on Oriental Language Recognition (OLR)…
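
To make the contrast between a constant margin and a per-sample margin concrete, the following is a minimal Python sketch, not the authors' implementation. `am_softmax_loss` follows the standard AM-Softmax form, in which the margin is subtracted from the target-class cosine before scaling. `phoneme_aware_margin`, its `phoneme_score` input, the bounds `m_min` and `m_max`, and the scale of 30 are hypothetical placeholders: the abstract says only that the margin is adjusted according to phoneme recognition results, so both the form and the direction of this mapping are assumptions.

```python
import math


def am_softmax_loss(cosines, target, margin, scale=30.0):
    """AM-Softmax loss for a single sample.

    cosines: cosine similarity between the embedding and each class weight.
    target:  index of the true language.
    margin:  additive margin subtracted from the target-class cosine.
    scale:   scaling factor applied to every logit (value is an assumption).
    """
    logits = [scale * (c - margin if j == target else c)
              for j, c in enumerate(cosines)]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]


def phoneme_aware_margin(phoneme_score, m_min=0.1, m_max=0.4):
    """HYPOTHETICAL rule: linearly map a phoneme score in [0, 1] to a margin.

    This is an illustrative placeholder, not the formula from the paper.
    """
    score = min(max(phoneme_score, 0.0), 1.0)
    return m_min + (m_max - m_min) * score


# One utterance scored against three languages; language 0 is correct.
cosines = [0.7, 0.2, -0.1]

# AM-Softmax: the same margin for every training sample.
fixed_loss = am_softmax_loss(cosines, target=0, margin=0.2)

# Per-sample margin: each sample gets its own margin from its phoneme score.
low_margin_loss = am_softmax_loss(cosines, 0, phoneme_aware_margin(0.0))
high_margin_loss = am_softmax_loss(cosines, 0, phoneme_aware_margin(1.0))
```

With the other inputs held fixed, a larger margin lowers the target logit and so raises the loss, which is how a per-sample margin changes the amount of training pressure placed on each sample.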

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing