Learning to Stop in Structured Prediction for Neural Machine Translation

Mingbo Ma; Renjie Zheng; Liang Huang

arXiv:1904.01032 · cs.CL · June 26, 2019 · 1 citation

TL;DR

This paper introduces a new ranking method and a structured prediction loss for neural machine translation, enabling optimal stopping criteria during beam search and improving translation quality and length accuracy.

Contribution

It proposes a novel ranking approach and a structured loss function that address the lack of principled stopping criteria in beam search for neural machine translation.

Findings

01. Improved BLEU scores on German-to-English and Chinese-to-English translation tasks.

02. Better length control in translated outputs.

03. A principled stopping criterion for beam search, leading to more accurate translations.

Abstract

Beam search optimization resolves many issues in neural machine translation. However, this method lacks principled stopping criteria and does not learn how to stop during training, and in practice the model naturally prefers longer hypotheses at test time because it uses the raw score instead of a probability-based score. We propose a novel ranking method that enables an optimal beam search stopping criterion. We further introduce a structured prediction loss function that penalizes suboptimal finished candidates produced by beam search during training. Experiments on neural machine translation with both synthetic data and real languages (German-to-English and Chinese-to-English) demonstrate that our proposed methods lead to better length and BLEU scores.
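The length problem described in the abstract can be illustrated with a toy beam search. The sketch below is not the authors' ranking method or loss. It only contrasts two cases under stated assumptions: when every step score is non-positive (as with log probabilities), a hypothesis can never gain score by growing, so the search can safely stop once the best finished hypothesis beats every unfinished one. With unbounded raw scores no such guarantee holds, and longer hypotheses keep accumulating score. The names `beam_search`, `log_prob_scorer`, and `raw_scorer`, and the toy score values, are hypothetical.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

EOS = "</s>"
Hyp = Tuple[str, ...]
# Hypothetical step scorer: maps a prefix to {next_token: step_score}.
StepScorer = Callable[[Hyp], Dict[str, float]]


def beam_search(step_scores: StepScorer, beam_size: int, max_len: int,
                monotone: bool) -> Tuple[Optional[Tuple[Hyp, float]], int]:
    """Return (best finished (hypothesis, score) or None, steps taken).

    If `monotone` is True, step scores are assumed to be <= 0, so the search
    stops early once the best finished hypothesis scores at least as high as
    the best unfinished one (no extension can overtake it).
    """
    beam: List[Tuple[Hyp, float]] = [((), 0.0)]
    best_finished: Optional[Tuple[Hyp, float]] = None
    steps = 0
    for steps in range(1, max_len + 1):
        # Expand every unfinished hypothesis by one token.
        candidates = [(prefix + (tok,), score + s)
                      for prefix, score in beam
                      for tok, s in step_scores(prefix).items()]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = []
        for hyp, score in candidates[:beam_size]:
            if hyp[-1] == EOS:
                if best_finished is None or score > best_finished[1]:
                    best_finished = (hyp, score)
            else:
                beam.append((hyp, score))
        if not beam:
            break
        if monotone and best_finished is not None \
                and best_finished[1] >= beam[0][1]:
            break  # safe only because scores cannot increase
    return best_finished, steps


def log_prob_scorer(prefix: Hyp) -> Dict[str, float]:
    # Toy model (assumption): ending becomes likely after two tokens.
    p_eos = 0.1 if len(prefix) < 2 else 0.8
    return {"a": math.log(1.0 - p_eos), EOS: math.log(p_eos)}


def raw_scorer(prefix: Hyp) -> Dict[str, float]:
    # Toy unnormalized scores (assumption): every extra token adds score.
    return {"a": 1.0, EOS: 0.5}


if __name__ == "__main__":
    print(beam_search(log_prob_scorer, beam_size=2, max_len=10, monotone=True))
    print(beam_search(raw_scorer, beam_size=2, max_len=10, monotone=False))
```

In this toy setup the log-probability search stops on its own after three steps, while the raw-score search has no safe stopping point and returns a hypothesis as long as `max_len` allows, which is the length bias the abstract refers to.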

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications