Language modeling with Neural trans-dimensional random fields
Bin Wang, Zhijian Ou

TL;DR
This paper introduces neural trans-dimensional random field language models (neural TRF LMs), which use neural networks to define nonlinear potentials. They are more accurate than the earlier discrete TRFs and faster in inference than LSTM LMs.
Contribution
It proposes neural TRFs with deep CNN potentials and a novel training strategy, combining neural network advantages with efficient inference in TRF frameworks.
Findings
Neural TRFs outperform discrete TRFs in speech recognition tasks.
Neural TRFs slightly outperform LSTM LMs with fewer parameters.
Neural TRFs offer 16x faster inference than LSTM models.
Abstract
Trans-dimensional random field language models (TRF LMs) have recently been introduced, where sentences are modeled as a collection of random fields. The TRF approach has been shown to have the advantages of being computationally more efficient in inference than LSTM LMs with close performance, and of being able to flexibly integrate rich features. In this paper we propose neural TRFs, going beyond the previous discrete TRFs, which only use linear potentials with discrete features. The idea is to use nonlinear potentials with continuous features, implemented by neural networks (NNs), in the TRF framework. Neural TRFs combine the advantages of both NNs and TRFs. The benefits of word embedding, nonlinear feature learning and larger context modeling are inherited from the use of NNs. At the same time, the strength of efficient inference by avoiding expensive softmax is preserved. A number of…
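To make the "efficient inference by avoiding expensive softmax" point concrete, here is a minimal sketch of whole-sentence scoring. It assumes the usual TRF form, in which the log-probability of a sentence of length l is a length-prior term minus a per-length log normalizer plus a potential, log p(l, x) = log pi_l - log Z_l + phi(x). The toy one-layer convolution below is only a stand-in for a nonlinear potential; it is not the authors' architecture, and every name and size (`potential`, `log_score`, `VOCAB`, `DIM`, `WIDTH`, the placeholder `log_pi` and `log_z` tables) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, DIM, WIDTH = 1000, 16, 3  # toy sizes, chosen only for illustration

# Hypothetical parameters: word embeddings, one convolution layer, output weights.
emb = rng.normal(scale=0.1, size=(VOCAB, DIM))
conv_w = rng.normal(scale=0.1, size=(WIDTH * DIM, 8))
out_w = rng.normal(scale=0.1, size=8)


def potential(word_ids):
    """Nonlinear potential phi(x): one scalar for the whole sentence."""
    x = emb[np.asarray(word_ids)]                # (len, DIM) continuous features
    pad = np.zeros((WIDTH - 1, DIM))
    x = np.vstack([pad, x])                      # left-pad so each position has a full window
    windows = np.stack([x[i:i + WIDTH].ravel() for i in range(len(word_ids))])
    hidden = np.maximum(windows @ conv_w, 0.0)   # ReLU over each context window
    return float((hidden @ out_w).sum())         # sum of position scores; no softmax over VOCAB


def log_score(word_ids, log_pi, log_z):
    """log p(l, x) = log pi_l - log Z_l + phi(x), with both tables indexed by length l."""
    l = len(word_ids)
    return float(log_pi[l] - log_z[l] + potential(word_ids))


# Placeholder tables for lengths 1..10 (index 0 unused). In a trained model the
# normalizers would be estimated during training; zeros are used here only as a stand-in.
log_pi = np.log(np.full(11, 0.1))
log_z = np.zeros(11)

# Rescoring two hypothetical candidate sentences, given as word-id lists.
candidates = [[5, 42, 7], [5, 42, 7, 99]]
scores = [log_score(c, log_pi, log_z) for c in candidates]
best = candidates[int(np.argmax(scores))]
```

Each candidate is scored with one pass that yields a scalar, so the cost per position does not depend on the vocabulary size. A softmax-based LM, by contrast, normalizes over all `VOCAB` words at every position.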
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
