Future Word Contexts in Neural Network Language Models
Xie Chen, Xunying Liu, Anton Ragni, Yu Wang, Mark Gales

TL;DR
This paper introduces succeeding word RNNLMs (su-RNNLMs), a neural network language model that uses a feedforward unit to incorporate a finite number of future words. su-RNNLMs outperform unidirectional RNNLMs (uni-RNNLMs) and nearly match bidirectional RNNLMs (bi-RNNLMs) in speech recognition tasks.
Contribution
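The paper proposes su-RNNLMs, a neural network structure that models future context with a feedforward unit over a finite number of succeeding words rather than a recurrent unit over the complete future context. This makes training much more efficient than for bi-RNNLMs and addresses the difficulty of using bi-RNNLMs within a lattice rescoring framework.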
Findings
su-RNNLMs outperform uni-RNNLMs in speech recognition tasks.
su-RNNLMs nearly match bi-RNNLMs in N-best rescoring.
Lattice rescoring with su-RNNLMs improves overall recognition accuracy.
Abstract
Recently, bidirectional recurrent network language models (bi-RNNLMs) have been shown to outperform standard, unidirectional, recurrent neural network language models (uni-RNNLMs) on a range of speech recognition tasks. This indicates that future word context information beyond the word history can be useful. However, bi-RNNLMs pose a number of challenges as they make use of the complete previous and future word context information. This impacts both training efficiency and their use within a lattice rescoring framework. In this paper these issues are addressed by proposing a novel neural network structure, succeeding word RNNLMs (su-RNNLMs). Instead of using a recurrent unit to capture the complete future word contexts, a feedforward unit is used to model a finite number of succeeding, future, words. This model can be trained much more efficiently than bi-RNNLMs and can also be used…
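The structure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name `su_rnnlm_probs`, the random untrained weights, the tanh activations, the zero initial history state, and the choice to sum the history and future vectors before the output layer are all assumptions made for illustration. The only property taken from the abstract is that the history is modeled by a recurrent unit while the future is modeled by a feedforward unit over a finite number `k` of succeeding words.

```python
import math
import random


def su_rnnlm_probs(words, vocab_size, k=2, hidden=8, seed=0):
    """Toy forward pass: for each position t, return a distribution over the
    vocabulary conditioned on the history words[:t] (recurrent state) and on
    at most k succeeding words words[t+1 : t+1+k] (feedforward unit).
    All weights are random and untrained; this is an illustrative assumption."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    def matvec(m, v):
        return [sum(a * b for a, b in zip(row, v)) for row in m]

    emb = mat(vocab_size, hidden)                    # word embeddings
    w_hist = mat(hidden, hidden)                     # recurrent weights (history)
    w_fut = [mat(hidden, hidden) for _ in range(k)]  # one projection per future slot
    w_out = mat(vocab_size, hidden)                  # output layer

    h = [0.0] * hidden  # history state; zeros stand in for a sentence-start token
    probs = []
    for t in range(len(words)):
        # Feedforward unit: combine up to k succeeding words; slots that fall
        # past the end of the sentence contribute nothing.
        f = [0.0] * hidden
        for j in range(k):
            idx = t + 1 + j
            if idx < len(words):
                proj = matvec(w_fut[j], emb[words[idx]])
                f = [a + b for a, b in zip(f, proj)]
        f = [math.tanh(x) for x in f]

        # Assumed combination: sum history and future vectors, then softmax.
        logits = matvec(w_out, [a + b for a, b in zip(h, f)])
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        probs.append([e / total for e in exps])

        # Recurrent unit: fold the current word into the history state.
        pre = [a + b for a, b in zip(matvec(w_hist, h), emb[words[t]])]
        h = [math.tanh(x) for x in pre]
    return probs


if __name__ == "__main__":
    sentence = [1, 3, 2, 0]  # hypothetical word ids
    for t, p in enumerate(su_rnnlm_probs(sentence, vocab_size=5, k=2)):
        print(t, round(p[sentence[t]], 4))
```

In this sketch, changing a word that lies more than `k` positions ahead leaves the distribution at the current position unchanged, which is the bounded dependence on future context that distinguishes this structure from a bi-RNNLM.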
