Transfer learning of language-independent end-to-end ASR with language model fusion

Hirofumi Inaguma; Jaejin Cho; Murali Karthick Baskar; Tatsuya Kawahara; Shinji Watanabe

arXiv:1811.02134 · cs.CL · May 8, 2019 · 6 cites

Open Access

TL;DR

This paper presents a transfer learning approach for low-resource language ASR that integrates an external language model during adaptation, improving performance on all five target languages of the IARPA BABEL data set when external text data is available.

Contribution

The paper introduces a language-independent end-to-end ASR system and LM fusion transfer, in which an external LM is integrated into the decoder throughout adaptation, to make adaptation to low-resource languages more effective.

Findings

01 LM fusion transfer outperforms simple transfer learning

02 Significant reduction in performance gap compared to hybrid systems

03 Effective use of external text data improves results

Abstract

This work explores better methods for adapting to low-resource languages using an external language model (LM) under the framework of transfer learning. We first build a language-independent ASR system in a unified sequence-to-sequence (S2S) architecture with a vocabulary shared among all languages. During adaptation, we perform LM fusion transfer, where an external LM is integrated into the decoder network of the attention-based S2S model throughout the whole adaptation stage, to effectively incorporate the linguistic context of the target language. We also investigate various seed models for transfer learning. Experimental evaluations using the IARPA BABEL data set show that LM fusion transfer improves performance on all five target languages compared with simple transfer learning when external text data is available. Our final system drastically reduces the performance gap from the hybrid…
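To make the idea of integrating an external LM into the decoder more concrete, here is a minimal sketch of one decoding step with a gated fusion layer. The abstract does not specify the fusion mechanism, so the gating scheme, the function name `fuse_step`, the dimensions, and the random weights are all assumptions for illustration, not the authors' implementation. In an actual system the gate and output weights would presumably be learned during adaptation, with the LM trained on target-language text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def log_softmax(logits):
    # Numerically stable log-softmax.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def fuse_step(dec_state, lm_state, W_gate, W_out):
    """One decoding step of a hypothetical gated LM fusion layer.

    dec_state: decoder hidden state of the S2S model (list of floats)
    lm_state:  hidden state of the external LM (list of floats)
    """
    joint = dec_state + lm_state                        # concatenate both states
    gate = [sigmoid(a) for a in matvec(W_gate, joint)]  # one gate per LM feature
    gated_lm = [g * h for g, h in zip(gate, lm_state)]  # scale the LM contribution
    fused = dec_state + gated_lm                        # fused decoder representation
    return log_softmax(matvec(W_out, fused))            # log-probs over shared vocab


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Toy sizes and random weights (assumptions, not values from the paper).
rng = random.Random(0)
DEC_DIM, LM_DIM, VOCAB = 4, 3, 5
W_gate = random_matrix(LM_DIM, DEC_DIM + LM_DIM, rng)
W_out = random_matrix(VOCAB, DEC_DIM + LM_DIM, rng)
dec_state = [rng.uniform(-1, 1) for _ in range(DEC_DIM)]
lm_state = [rng.uniform(-1, 1) for _ in range(LM_DIM)]

log_probs = fuse_step(dec_state, lm_state, W_gate, W_out)
```

The gate lets the decoder decide, per step, how much of the LM's context to use. This differs from combining the two models' scores only at inference time, since here the LM state enters the decoder's output layer and can shape training throughout adaptation.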

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling