Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

Qianying Liu; Zhuo Gong; Zhengdong Yang; Yuhang Yang; Sheng Li; Chenchen Ding; Nobuaki Minematsu; Hao Huang; Fei Cheng; Chenhui Chu; Sadao Kurohashi

arXiv:2204.03855·eess.AS·May 2, 2023

TL;DR

This paper introduces a multilingual hierarchical Softmax approach that leverages neighboring languages' similarities to improve low-resource speech recognition accuracy and efficiency.

Contribution

It proposes a novel hierarchical Softmax decoding method based on linguistic unit similarities across languages for low-resource speech recognition.

Findings

01 Improved recognition accuracy in low-resource settings

02 Enhanced decoding efficiency

03 Effective cross-lingual knowledge sharing

Abstract

Low-resource speech recognition has long suffered from insufficient training data. In this paper, we propose an approach that leverages neighboring languages to improve performance in low-resource scenarios. It is founded on the hypothesis that similar linguistic units in neighboring languages exhibit comparable term frequency distributions, which enables us to construct a Huffman tree for multilingual hierarchical Softmax decoding. This hierarchical structure enables cross-lingual knowledge sharing among similar tokens, thereby enhancing low-resource training outcomes. Empirical analyses demonstrate that our method improves both the accuracy and the efficiency of low-resource speech recognition.
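To make the mechanism in the abstract concrete, the following is a minimal sketch of hierarchical Softmax over a Huffman tree built from token frequencies. It is not the authors' implementation: the function names (`build_huffman_paths`, `token_prob`), the toy frequency table, and the per-node weight vectors are all hypothetical, and the paper's own tree is built over multilingual linguistic units rather than the four placeholder tokens used here.

```python
import heapq
import itertools
import math


def build_huffman_paths(freqs):
    """Build a Huffman tree from {token: frequency}.

    Returns (paths, n_internal), where paths[token] is the root-to-leaf
    list of (internal_node_id, bit) decisions and n_internal is the
    number of internal nodes (one binary classifier each).
    """
    tie = itertools.count()  # unique tie-breaker so nodes are never compared
    heap = [(f, next(tie), ("leaf", tok)) for tok, f in freqs.items()]
    heapq.heapify(heap)
    n_internal = 0
    while len(heap) > 1:
        f0, _, left = heapq.heappop(heap)   # two least frequent subtrees
        f1, _, right = heapq.heappop(heap)
        node = ("internal", n_internal, left, right)
        n_internal += 1
        heapq.heappush(heap, (f0 + f1, next(tie), node))
    root = heap[0][2]

    paths = {}
    stack = [(root, [])]
    while stack:
        node, path = stack.pop()
        if node[0] == "leaf":
            paths[node[1]] = path
        else:
            _, nid, left, right = node
            stack.append((left, path + [(nid, 0)]))
            stack.append((right, path + [(nid, 1)]))
    return paths, n_internal


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def token_prob(path, hidden, node_weights):
    """P(token | hidden) as a product of binary decisions along the path."""
    p = 1.0
    for nid, bit in path:
        score = sum(w * h for w, h in zip(node_weights[nid], hidden))
        p_right = sigmoid(score)
        p *= p_right if bit == 1 else 1.0 - p_right
    return p


# Toy example: frequent tokens end up close to the root (short paths).
freqs = {"a": 50, "b": 30, "k": 15, "q": 5}          # hypothetical counts
paths, n_internal = build_huffman_paths(freqs)
hidden = [0.2, -1.0, 0.5]                            # hypothetical decoder state
node_weights = [[0.1 * (i + 1), -0.3, 0.7] for i in range(n_internal)]
probs = {tok: token_prob(p, hidden, node_weights) for tok, p in paths.items()}
```

Because each internal node splits its probability mass between its two children, the leaf probabilities sum to one without normalizing over the whole vocabulary, and scoring a token costs one binary decision per tree level rather than one score per vocabulary entry. Tokens whose paths share a prefix also share the classifiers on that prefix, which is one way to picture the cross-lingual sharing the abstract describes.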

Code & Models

Repositories

Derek-Gong/hsoftmax (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing

Methods: Softmax · Hierarchical Softmax