Reward Optimization for Neural Machine Translation with Learned Metrics
Raphael Shu, Kang Min Yoo, Jung-Woo Ha

TL;DR
This paper explores optimizing neural machine translation models with learned, model-based metrics such as BLEURT, and reports significant improvements in translation quality, including in human evaluation, over traditional BLEU-based training.
Contribution
It introduces a contrastive-margin loss for stable reward optimization with learned metrics and shows that it improves translation quality over training with smoothed BLEU.
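To make the idea of a margin-based reward loss concrete, here is a minimal sketch in plain Python. It is not the paper's exact formulation, which this summary does not give. It assumes a generic hinge loss that asks the model to score the highest-reward candidate at least `margin` above the lowest-reward one. The function name `contrastive_margin_loss`, its parameters, and the toy numbers are all hypothetical; in practice the rewards would come from a learned metric such as BLEURT and the log-probabilities from the NMT model.

```python
from typing import Sequence


def contrastive_margin_loss(
    log_probs: Sequence[float],
    rewards: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Hinge loss between the best- and worst-rewarded candidates.

    Assumed form (illustrative only, not taken from the paper):
        loss = max(0, margin - (log p(y_best | x) - log p(y_worst | x)))
    """
    if len(log_probs) != len(rewards) or len(log_probs) < 2:
        raise ValueError("need at least two candidates with matching rewards")

    # Pick the candidates that the metric rates highest and lowest.
    best = max(range(len(rewards)), key=lambda i: rewards[i])
    worst = min(range(len(rewards)), key=lambda i: rewards[i])
    if best == worst:
        # All rewards are tied, so there is no preference to learn from.
        return 0.0

    # The loss is zero once the model prefers the better candidate by `margin`.
    score_gap = log_probs[best] - log_probs[worst]
    return max(0.0, margin - score_gap)


# Toy example: three candidate translations for one source sentence.
# The model currently prefers the candidate the metric likes least.
print(contrastive_margin_loss([-2.0, -1.0, -3.0], [0.9, 0.2, 0.5]))  # 2.0
# Here the model already prefers the better candidate by more than the margin.
print(contrastive_margin_loss([-1.0, -5.0], [0.9, 0.1]))  # 0.0
```

Under these assumptions the loss depends only on the ranking that the metric induces, not on the raw reward values, which is one plausible reason a margin formulation could be less sensitive to the scale of a learned metric.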
Findings
BLEURT-based training significantly increases metric scores.
Models trained with BLEURT show improved adequacy and coverage.
Human evaluations favor BLEURT-optimized models.
Abstract
Neural machine translation (NMT) models are conventionally trained with token-level negative log-likelihood (NLL), which does not guarantee that the generated translations will be optimized for a selected sequence-level evaluation metric. Multiple approaches have been proposed to train NMT with BLEU as the reward, in order to directly improve the metric. However, it has been reported that the gain in BLEU does not translate to real quality improvement, limiting the application in industry. Recently, it became clear to the community that BLEU has a low correlation with human judgment when dealing with state-of-the-art models. This has led to the emergence of model-based evaluation metrics. These new metrics have been shown to have a much higher human correlation. In this paper, we investigate whether it is beneficial to optimize NMT models with the state-of-the-art model-based metric, BLEURT. We propose a…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
