Future Vector Enhanced LSTM Language Model for LVCSR
Qi Liu, Yanmin Qian, Kai Yu

TL;DR
This paper introduces a future vector enhanced LSTM language model for LVCSR that incorporates future sequence information to improve long-term sequence prediction and speech recognition accuracy.
Contribution
It proposes a novel LSTM LM using future vectors to better model long-term dependencies in speech recognition tasks.
Findings
Improved BLEU scores for long-term sequence prediction.
Slight gains in speech recognition rescoring performance.
Significant WER reduction when combined with conventional LSTM LMs.
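The last finding concerns using both language models together when rescoring recognition hypotheses. The summary does not say how the two scores are combined, so the following is only a minimal sketch of one common approach: log-linear interpolation of sentence-level LM log-probabilities over an N-best list. The names `Hypothesis`, `combined_score`, `rescore`, `lm_scale`, and `mix`, and all numeric values, are hypothetical and are not taken from the paper.

```python
from typing import List, NamedTuple


class Hypothesis(NamedTuple):
    text: str
    acoustic_logp: float   # first-pass acoustic log-score (illustrative value)
    conv_lm_logp: float    # sentence log-prob under a conventional LSTM LM
    future_lm_logp: float  # sentence log-prob under a future vector enhanced LM


def combined_score(h: Hypothesis, lm_scale: float = 10.0, mix: float = 0.5) -> float:
    """Acoustic score plus a scaled, log-linearly interpolated LM score.

    Assumption: the two LMs are mixed with a single weight `mix`;
    the paper's actual combination scheme may differ.
    """
    lm_logp = mix * h.future_lm_logp + (1.0 - mix) * h.conv_lm_logp
    return h.acoustic_logp + lm_scale * lm_logp


def rescore(nbest: List[Hypothesis], lm_scale: float = 10.0, mix: float = 0.5) -> List[Hypothesis]:
    """Return the N-best list sorted from best to worst combined score."""
    return sorted(nbest, key=lambda h: combined_score(h, lm_scale, mix), reverse=True)


# Toy N-best list with made-up scores, for illustration only.
nbest = [
    Hypothesis("wreck a nice beach", -118.0, -14.0, -15.0),
    Hypothesis("recognize speech", -120.0, -9.0, -8.0),
]
best = rescore(nbest)[0]
print(best.text)
```

Setting `mix=0.0` falls back to the conventional LM alone and `mix=1.0` to the enhanced LM alone, so the same function can be used to compare single-model and combined rescoring on a development set.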
Abstract
Language models (LMs) play an important role in large vocabulary continuous speech recognition (LVCSR). However, traditional language models only predict the next single word given the history, while consecutive predictions over a sequence of words are usually demanded and useful in LVCSR. The mismatch between single-word prediction modeling in training and the long-term sequence prediction demanded in real use may lead to performance degradation. In this paper, a novel enhanced long short-term memory (LSTM) LM using the future vector is proposed. In addition to the given history, the rest of the sequence is also embedded by future vectors. This future vector can be incorporated into the LSTM LM, giving it the ability to model much longer-term sequence-level information. Experiments show that the proposed LSTM LM achieves better BLEU scores for long term sequence…
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
