Large Margin Neural Language Model

Jiaji Huang; Yi Li; Wei Ping; Liang Huang

arXiv:1808.08987·cs.CL·August 29, 2018

TL;DR

This paper introduces a large margin training criterion for neural language models. The criterion enlarges the margin between sentences that are good and bad in a task-specific sense, and it outperforms conventional minimum-perplexity training on speech recognition and machine translation.

Contribution

The paper proposes a large margin formulation for neural language models that is trained end-to-end, applies to tasks that re-score generated text, and outperforms minimum-PPL training.

Findings

01 Up to 1.1 WER reduction in speech recognition
02 Up to 1.0 BLEU increase in machine translation
03 Effective alternative to perplexity minimization

Abstract

We propose a large margin criterion for training neural language models. Conventionally, neural language models are trained by minimizing perplexity (PPL) on grammatical sentences. However, we demonstrate that PPL may not be the best metric to optimize in some tasks, and further propose a large margin formulation. The proposed method aims to enlarge the margin between the "good" and "bad" sentences in a task-specific sense. It is trained end-to-end and can be widely applied to tasks that involve re-scoring of generated text. Compared with minimum-PPL training, our method gains up to 1.1 WER reduction for speech recognition and 1.0 BLEU increase for machine translation.
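The contrast the abstract draws between minimum-PPL training and a margin objective can be illustrated with a small sketch. This is not the paper's exact formulation, which the abstract does not spell out. It assumes a generic pairwise hinge loss over sentence-level log-probabilities, and the names `perplexity`, `sentence_score`, `pairwise_margin_loss`, and the `margin` parameter are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of one sentence: exp of the negative mean token log-probability."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def sentence_score(token_log_probs):
    """Sentence-level LM score, taken here as the summed token log-probabilities."""
    return sum(token_log_probs)


def pairwise_margin_loss(good_scores, bad_scores, margin=1.0):
    """Illustrative hinge loss (an assumption, not the paper's stated objective).

    Every "good" candidate should outscore every "bad" candidate by at
    least `margin`; pairs that already satisfy this contribute zero.
    """
    losses = [
        max(0.0, margin - (good - bad))
        for good in good_scores
        for bad in bad_scores
    ]
    return sum(losses) / len(losses)


# Hypothetical token log-probabilities for two re-scoring candidates.
good_candidate = [math.log(0.5), math.log(0.5)]
bad_candidate = [math.log(0.5), math.log(0.4)]

loss = pairwise_margin_loss(
    [sentence_score(good_candidate)],
    [sentence_score(bad_candidate)],
    margin=1.0,
)
```

In this toy case the two candidates have similar perplexity, yet the loss is still positive because the good candidate does not yet lead the bad one by the full margin. That is the kind of separation a margin criterion pushes for and a pure perplexity objective does not directly enforce.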

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis