Minimum Risk Training for Neural Machine Translation

Shiqi Shen; Yong Cheng; Zhongjun He; Wei He; Hua Wu; Maosong Sun; Yang Liu

arXiv:1512.02433 · cs.CL · June 16, 2016 · 43 cites

TL;DR

This paper introduces minimum risk training for neural machine translation, which optimizes model parameters directly with respect to evaluation metrics and yields significant improvements over maximum likelihood estimation across multiple language pairs.

Contribution

It presents a training method that directly optimizes evaluation metrics, including non-differentiable ones, and that can be applied to various neural network architectures in NLP.

Findings

01 Achieves significant translation quality improvements over maximum likelihood estimation

02 Applicable across multiple language pairs

03 Compatible with different neural network architectures

Abstract

We propose minimum risk training for end-to-end neural machine translation. Unlike conventional maximum likelihood estimation, minimum risk training is capable of optimizing model parameters directly with respect to arbitrary evaluation metrics, which are not necessarily differentiable. Experiments show that our approach achieves significant improvements over maximum likelihood estimation on a state-of-the-art neural machine translation system across various language pairs. Transparent to architectures, our approach can be applied to more neural networks and potentially benefit more NLP tasks.
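The abstract's central idea, optimizing against a metric that need not be differentiable, can be illustrated with a minimal sketch. It assumes the expected loss is approximated over a small set of candidate translations whose model probabilities are renormalized within that set. The function names, the `alpha` sharpness parameter, and the toy numbers are illustrative assumptions, not taken from this page. The key point is that the metric-based loss enters only as a constant weight, so the gradient flows through the model's log-probabilities rather than through the metric itself.

```python
import math


def expected_risk(log_probs, losses, alpha=1.0):
    """Expected loss over a candidate set (hypothetical helper).

    log_probs: model log-probabilities log P(y|x) for each candidate
    losses:    metric-based loss per candidate, e.g. 1 - sentence-level score
    alpha:     assumed sharpness of the renormalized distribution
    Returns (risk, q), where q is the renormalized candidate distribution.
    """
    scaled = [alpha * lp for lp in log_probs]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    z = sum(weights)
    q = [w / z for w in weights]
    risk = sum(qi * li for qi, li in zip(q, losses))
    return risk, q


def risk_gradient(q, losses, risk, alpha=1.0):
    """Gradient of the risk with respect to each candidate's log-probability.

    d risk / d log_prob_i = alpha * q_i * (loss_i - risk).
    The losses are treated as constants, so the metric is never differentiated.
    """
    return [alpha * qi * (li - risk) for qi, li in zip(q, losses)]


# Toy example: three candidate translations with made-up probabilities and losses.
log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
losses = [0.2, 0.5, 0.9]
risk, q = expected_risk(log_probs, losses)
grad = risk_gradient(q, losses, risk)
```

In this toy case the first candidate has a below-average loss, so its gradient entry is negative: raising its log-probability lowers the expected risk. In a real system the log-probabilities would come from the translation model and an automatic-differentiation framework would propagate this gradient to the parameters.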

Code & Models

Repositories

kh-kim/simple-nmt (PyTorch)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications