Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model

Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, and Dong Yu

arXiv:2201.01995 · cs.CL · April 13, 2022

TL;DR

This paper introduces a novel decoding algorithm for Mandarin E2E speech recognition that constructs word-level lattices on-the-fly, enabling effective integration of external word N-gram LMs and achieving state-of-the-art results.

Contribution

It proposes a new decoding method that constructs word-level lattices dynamically, allowing better use of external word-level language models in Mandarin ASR.
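To make the idea concrete, here is a minimal sketch of scoring one character hypothesis against a word-level LM through a lattice. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy lexicon, the uniform bigram model, and the choice to sum over all lattice paths (rather than take the best path) are all hypothetical.

```python
import math
from collections import defaultdict


def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def lattice_lm_score(chars, lexicon, bigram_logprob, max_word_len=4):
    """Log-sum of word-bigram scores over every segmentation of `chars`.

    forward[i][w] holds the accumulated log score of all word paths that
    cover chars[:i] and end in word w (the bigram LM only needs the last word).
    Summing over paths is an assumption made for this sketch.
    """
    forward = [defaultdict(lambda: -math.inf) for _ in range(len(chars) + 1)]
    forward[0]["<s>"] = 0.0
    for end in range(1, len(chars) + 1):
        for start in range(max(0, end - max_word_len), end):
            word = chars[start:end]
            if word not in lexicon:
                continue  # no lattice arc for out-of-lexicon spans
            for prev, score in list(forward[start].items()):
                cand = score + bigram_logprob(prev, word)
                forward[end][word] = logaddexp(forward[end][word], cand)
    total = -math.inf
    for score in forward[len(chars)].values():
        total = logaddexp(total, score)
    return total


# Hypothetical toy lexicon and a uniform "bigram" that gives every word 0.5.
toy_lexicon = {"南京", "市", "南京市", "市长", "江", "长江", "大桥", "长江大桥"}
toy_bigram = lambda prev, word: math.log(0.5)

lm_score = lattice_lm_score("南京市长江大桥", toy_lexicon, toy_bigram)
print(math.exp(lm_score))
```

Keeping one forward entry per (position, last word) pair means the lattice grows with the hypothesis and is never enumerated path by path, which is what makes on-the-fly scoring of partial hypotheses plausible during beam search.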

Findings

01  Outperforms subword-level LMs in experiments.

02  Achieves state-of-the-art CER on Aishell datasets.

03  Reduces CER by 14.8% on a large Mandarin dataset.

Abstract

Despite the rapid progress of end-to-end (E2E) automatic speech recognition (ASR), it has been shown that incorporating external language models (LMs) into the decoding can further improve the recognition performance of E2E ASR systems. To align with the modeling units adopted in E2E ASR systems, subword-level (e.g., characters, BPE) LMs are usually used to cooperate with current E2E ASR systems. However, the use of subword-level LMs will ignore the word-level information, which may limit the strength of the external LMs in E2E ASR. Although several methods have been proposed to incorporate word-level external LMs in E2E ASR, these methods are mainly designed for languages with clear word boundaries such as English and cannot be directly applied to languages like Mandarin, in which each character sequence can have multiple corresponding word sequences. To this end, we propose a novel…
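The ambiguity mentioned above, where one character sequence maps to several word sequences, can be seen with a short enumeration. This is a hedged sketch only: the `segmentations` helper and the toy lexicon are assumptions introduced for illustration and do not come from the paper.

```python
def segmentations(chars, lexicon, max_word_len=4):
    """Return every way to split `chars` into words found in `lexicon`."""
    if not chars:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for length in range(1, min(max_word_len, len(chars)) + 1):
        word = chars[:length]
        if word in lexicon:
            for rest in segmentations(chars[length:], lexicon, max_word_len):
                results.append([word] + rest)
    return results


# Hypothetical toy lexicon; a real system would use a full pronunciation
# or word-frequency lexicon.
toy_words = {"研究", "研究生", "生", "生命", "命"}
candidates = segmentations("研究生命", toy_words)
for words in candidates:
    print(" / ".join(words))
```

A character-level LM scores only the character string and cannot tell these readings apart, whereas a word-level LM assigns each candidate its own probability. English text, with spaces marking word boundaries, does not face this choice.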

Code & Models

Repositories

jctian98/e2e_lfmmi (PyTorch)
