Minimum Risk Training for Neural Machine Translation
Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu

TL;DR
This paper introduces minimum risk training for neural machine translation, which optimizes model parameters directly against evaluation metrics and yields significant improvements over maximum likelihood estimation across multiple language pairs.
Contribution
It presents a training method that optimizes evaluation metrics directly, even when they are not differentiable, and that is independent of the underlying neural network architecture.
Findings
Achieves significant translation quality improvements
Applicable across multiple language pairs
Compatible with different neural network architectures
Abstract
We propose minimum risk training for end-to-end neural machine translation. Unlike conventional maximum likelihood estimation, minimum risk training is capable of optimizing model parameters directly with respect to arbitrary evaluation metrics, which are not necessarily differentiable. Experiments show that our approach achieves significant improvements over maximum likelihood estimation on a state-of-the-art neural machine translation system across various language pairs. Transparent to architectures, our approach can be applied to more neural networks and potentially benefit more NLP tasks.
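To make the idea of optimizing against a non-differentiable metric concrete, here is a minimal sketch of an expected-risk computation. It is an illustration under stated assumptions, not the authors' implementation: it assumes the risk is approximated over a small set of candidate translations, that each candidate has a model log-probability and a precomputed loss (for example, 1 minus a sentence-level metric score), and that a hypothetical sharpness parameter `alpha` rescales the distribution before renormalizing. The function name `expected_risk` is also hypothetical.

```python
import math


def expected_risk(log_probs, losses, alpha=1.0):
    """Expected loss over a candidate set (illustrative sketch).

    log_probs: model log-probabilities of the candidate translations.
    losses:    per-candidate loss values, e.g. 1 - metric score (assumed
               to be computed elsewhere; the metric need not be differentiable).
    alpha:     hypothetical sharpness parameter applied before renormalizing.
    """
    if len(log_probs) != len(losses) or not log_probs:
        raise ValueError("log_probs and losses must be non-empty and equal length")

    # Scale log-probabilities and renormalize over the candidate set.
    scaled = [alpha * lp for lp in log_probs]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - max_scaled) for s in scaled]
    total = sum(weights)
    q = [w / total for w in weights]

    # The risk is the loss averaged under the renormalized distribution.
    return sum(qi * li for qi, li in zip(q, losses))


# Three candidates: the first is the closest to the reference (lowest loss).
candidate_log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
candidate_losses = [0.1, 0.6, 0.9]
risk = expected_risk(candidate_log_probs, candidate_losses)
```

The loss values enter only as constants that weight the probabilities, so the gradient flows through the model probabilities rather than through the metric. Lowering this quantity means shifting probability mass toward candidates with lower loss.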
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
