Future Vector Enhanced LSTM Language Model for LVCSR

Qi Liu; Yanmin Qian; Kai Yu

arXiv:2008.01832 · eess.AS · August 6, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a future vector enhanced LSTM language model for LVCSR that incorporates future sequence information to improve long-term sequence prediction and speech recognition accuracy.

Contribution

It proposes a novel LSTM LM using future vectors to better model long-term dependencies in speech recognition tasks.

Findings

01. Improved BLEU scores for long-term sequence prediction.

02. Slight gains in speech recognition rescoring performance.

03. Significant WER reduction when combined with conventional LSTM LMs.
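To make the idea of combining two language models during rescoring concrete, here is a minimal sketch of n-best rescoring with log-linear interpolation of two LM scores. It is an illustration only: this page does not state how the paper combines the models, so the interpolation scheme, the function name `rescore_nbest`, the dictionary keys, and the weights are all assumptions.

```python
def rescore_nbest(hypotheses, lm_weight=0.5, lm_scale=1.0):
    """Return the best hypothesis after combining two LM scores.

    Each hypothesis is a dict with hypothetical keys (assumed for
    this sketch, not taken from the paper):
      "text"      : the candidate word sequence
      "am_logp"   : acoustic model log-score
      "lm_a_logp" : log-probability from a conventional LSTM LM
      "lm_b_logp" : log-probability from a second LM
                    (for example, a future-vector enhanced LM)
    """
    if not hypotheses:
        raise ValueError("n-best list is empty")

    def total_score(hyp):
        # Log-linear interpolation of the two LM log-probabilities.
        lm_logp = (lm_weight * hyp["lm_a_logp"]
                   + (1.0 - lm_weight) * hyp["lm_b_logp"])
        # Add the scaled LM score to the acoustic score.
        return hyp["am_logp"] + lm_scale * lm_logp

    return max(hypotheses, key=total_score)


nbest = [
    {"text": "hypothesis one", "am_logp": -10.0,
     "lm_a_logp": -5.0, "lm_b_logp": -9.0},
    {"text": "hypothesis two", "am_logp": -10.5,
     "lm_a_logp": -6.0, "lm_b_logp": -4.0},
]

# Equal weighting of both LMs versus the first LM alone.
best_combined = rescore_nbest(nbest, lm_weight=0.5)
best_single = rescore_nbest(nbest, lm_weight=1.0)
```

In this toy n-best list, the selected hypothesis changes depending on whether the second LM contributes, which is the kind of complementarity that a combined rescoring setup relies on. In practice the weights would be tuned on held-out data.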

Abstract

Language models (LMs) play an important role in large vocabulary continuous speech recognition (LVCSR). However, traditional language models predict only the next single word given the history, whereas LVCSR usually requires consecutive predictions over a sequence of words. This mismatch between single-word prediction in training and long-term sequence prediction in real use may lead to performance degradation. In this paper, a novel enhanced long short-term memory (LSTM) LM using a future vector is proposed. In addition to the given history, the rest of the sequence is also embedded by future vectors. The future vector can be incorporated into the LSTM LM, giving it the ability to model much longer-term, sequence-level information. Experiments show that the proposed LSTM LM achieves better BLEU scores for long term sequence…

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory