Learning neural trans-dimensional random field language models with noise-contrastive estimation

Bin Wang; Zhijian Ou

arXiv:1710.10739 · cs.CL · October 31, 2017

Open Access

TL;DR

This paper introduces improved training techniques for neural trans-dimensional random field language models, combining exponential tilting, noise-contrastive estimation, and deep neural networks to enhance scalability and performance in speech recognition.

Contribution

The paper proposes novel reformulations and estimation methods for neural TRF LMs, significantly boosting training efficiency and accuracy over previous approaches.

Findings

01

Trained on a 40x larger training set in only 1/3 of the training time

02

Reduced word error rate by a relative 4.7% over a strong LSTM baseline

03

Enhanced neural TRF LMs with deep CNN and bidirectional LSTM features

Abstract

Trans-dimensional random field language models (TRF LMs), in which sentences are modeled as a collection of random fields, have shown performance close to that of LSTM LMs in speech recognition and are computationally more efficient in inference. However, the training efficiency of neural TRF LMs is not satisfactory, which limits the scalability of TRF LMs to large training corpora. In this paper, several techniques for both model formulation and parameter estimation are proposed to improve the training efficiency and the performance of neural TRF LMs. First, TRFs are reformulated in the form of exponential tilting of a reference distribution. Second, noise-contrastive estimation (NCE) is introduced to jointly estimate the model parameters and normalization constants. Third, we extend the neural TRF LMs by marrying the deep convolutional neural network (CNN) and the bidirectional LSTM into the…
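
The first two techniques named in the abstract, exponential tilting of a reference distribution and NCE with the normalization constant treated as a learnable parameter, can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`tilted_log_score`, `nce_loss`), the scalar `log_z`, and all numbers are assumptions chosen for illustration, and the generic binary-classification form of NCE is used rather than the paper's exact objective.

```python
import numpy as np

def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    return -np.logaddexp(0.0, -z)

def tilted_log_score(log_ref, phi, log_z):
    # Exponential tilting: log p(x) = log q(x) + phi(x) - log Z,
    # where q is a reference distribution, phi a learned potential,
    # and log Z a learnable estimate of the log-normalizer (hypothetical names).
    return log_ref + phi - log_z

def nce_loss(log_model_data, log_noise_data,
             log_model_noise, log_noise_noise, nu):
    # Negative NCE objective, averaged over data samples.
    # nu is the ratio of noise samples to data samples.
    # Log-odds that a sample came from the data rather than the noise:
    logit_data = log_model_data - (np.log(nu) + log_noise_data)
    logit_noise = log_model_noise - (np.log(nu) + log_noise_noise)
    n_data = len(log_model_data)
    total = np.sum(log_sigmoid(logit_data)) + np.sum(log_sigmoid(-logit_noise))
    return -total / n_data

# Toy setup: the noise distribution equals the reference distribution,
# the potential is zero, and log Z = 0, so the model equals the noise.
log_q_data = np.log(np.array([0.1, 0.2, 0.3, 0.4]))   # log q(x) on data samples
log_q_noise = np.log(np.array([0.4, 0.3, 0.2, 0.1]))  # log q(x) on noise samples
phi_data = np.zeros(4)
phi_noise = np.zeros(4)
log_z = 0.0

loss = nce_loss(
    tilted_log_score(log_q_data, phi_data, log_z), log_q_data,
    tilted_log_score(log_q_noise, phi_noise, log_z), log_q_noise,
    nu=1.0,
)
print(loss)  # 2 * log(2), about 1.386: the classifier cannot tell data from noise
```

In this sketch, `log_z` enters the loss like any other parameter, so a gradient-based optimizer could update it together with the parameters of `phi`; that is the sense in which NCE estimates model parameters and normalization constants jointly.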

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory