Large Margin Neural Language Model
Jiaji Huang, Yi Li, Wei Ping, Liang Huang

TL;DR
This paper introduces a large margin training criterion for neural language models. The criterion enlarges the margin between good and bad sentences in a task-specific sense, and it outperforms conventional minimum-perplexity training on re-scoring tasks.
Contribution
It proposes a large margin formulation for neural language models that is trained end-to-end, applies to tasks that re-score generated text, and outperforms minimum-PPL training.
Findings
Up to 1.1 WER reduction in speech recognition, compared with minimum-PPL training
Up to 1.0 BLEU increase in machine translation, compared with minimum-PPL training
Effective alternative to perplexity minimization
Abstract
We propose a large margin criterion for training neural language models. Conventionally, neural language models are trained by minimizing perplexity (PPL) on grammatical sentences. However, we demonstrate that PPL may not be the best metric to optimize in some tasks, and further propose a large margin formulation. The proposed method aims to enlarge the margin between the "good" and "bad" sentences in a task-specific sense. It is trained end-to-end and can be widely applied to tasks that involve re-scoring of generated text. Compared with minimum-PPL training, our method gains up to 1.1 WER reduction for speech recognition and 1.0 BLEU increase for machine translation.
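The abstract does not give the loss itself, so the following is only a minimal sketch of one common way to write a large margin objective: a pairwise hinge loss over candidate scores. The function names, the default margin of 1.0, and the use of log-probabilities as scores are assumptions made for illustration, not the paper's formulation.

```python
from typing import Callable, Sequence


def pairwise_margin_loss(
    good_scores: Sequence[float],
    bad_scores: Sequence[float],
    margin: float = 1.0,  # assumed value, not taken from the paper
) -> float:
    """Mean hinge loss over all (good, bad) score pairs.

    Each score is assumed to be a language-model score for a whole
    sentence (for example, its log-probability). A pair contributes
    zero loss once the good sentence outscores the bad one by at
    least `margin`.
    """
    total = 0.0
    count = 0
    for g in good_scores:
        for b in bad_scores:
            total += max(0.0, margin - (g - b))
            count += 1
    return total / count if count else 0.0


def rescore(candidates: Sequence[str], lm_score: Callable[[str], float]) -> str:
    """Return the candidate with the highest language-model score."""
    return max(candidates, key=lm_score)


if __name__ == "__main__":
    # Hypothetical scores for two candidate transcriptions.
    scores = {"recognize speech": -3.0, "wreck a nice beach": -2.5}
    # The worse candidate currently scores higher, so the loss is positive.
    print(pairwise_margin_loss([scores["recognize speech"]],
                               [scores["wreck a nice beach"]]))
    print(rescore(list(scores), scores.get))
```

In this sketch, minimizing the loss pushes the score of each good sentence above the score of each bad sentence by at least the margin, which is the quantity that matters when a re-scorer picks the top candidate. Perplexity on well-formed text alone does not constrain that gap.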
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
