Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation

Boxuan Lyu; Hidetaka Kamigaito; Kotaro Funakoshi; Manabu Okumura

arXiv:2406.11632 · cs.CL · May 27, 2025

TL;DR

This paper introduces source-based MBR (sMBR) decoding for neural machine translation, which uses quasi-sources and a reference-free quality estimation metric to improve translation quality over QE reranking and standard MBR decoding.

Contribution

The paper proposes the first MBR decoding method to rely solely on sources, using quasi-sources as support hypotheses and a quality estimation metric as the utility function. It outperforms QE reranking and standard MBR decoding.

Findings

1. sMBR outperforms QE reranking
2. sMBR surpasses standard MBR decoding
3. Source-based approach improves translation quality

Abstract

Maximum a posteriori decoding, a commonly used method for neural machine translation (NMT), aims to maximize the estimated posterior probability. However, high estimated probability does not always lead to high translation quality. Minimum Bayes Risk (MBR) decoding offers an alternative by seeking hypotheses with the highest expected utility. Inspired by Quality Estimation (QE) reranking, which uses the QE model as a ranker, we propose source-based MBR (sMBR) decoding, a novel approach that utilizes quasi-sources (generated via paraphrasing or back-translation) as "support hypotheses" and a reference-free quality estimation metric as the utility function, marking the first work to solely use sources in MBR decoding. Experiments show that sMBR outperforms QE reranking and standard MBR decoding. Our findings suggest that sMBR is a promising approach for NMT decoding.
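
The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `smbr_decode` and `toy_qe` are hypothetical, the expected utility is assumed to be a uniform average over the quasi-sources, and `toy_qe` is a token-overlap placeholder standing in for a trained reference-free QE model. Whether the original source is included among the quasi-sources is left to the caller.

```python
from typing import Callable, Sequence


def smbr_decode(
    candidates: Sequence[str],
    quasi_sources: Sequence[str],
    qe_utility: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest average QE utility.

    Assumption: expected utility is a uniform mean over the quasi-sources,
    which play the role of "support hypotheses".
    """
    if not candidates or not quasi_sources:
        raise ValueError("need at least one candidate and one quasi-source")

    def expected_utility(hypothesis: str) -> float:
        total = sum(qe_utility(src, hypothesis) for src in quasi_sources)
        return total / len(quasi_sources)

    # max() keeps the first candidate in case of a tie.
    return max(candidates, key=expected_utility)


def toy_qe(source: str, hypothesis: str) -> float:
    """Hypothetical stand-in for a QE metric: Jaccard overlap of tokens.

    A real system would call a trained reference-free QE model here.
    """
    src_tokens = set(source.lower().split())
    hyp_tokens = set(hypothesis.lower().split())
    union = src_tokens | hyp_tokens
    return len(src_tokens & hyp_tokens) / len(union) if union else 0.0
```

With a single source in `quasi_sources`, this sketch reduces to QE reranking, since each candidate is then scored against that one source only.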

Taxonomy

Topics: Natural Language Processing Techniques