Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices

Felix Stahlberg; Adrià de Gispert; Eva Hasler; Bill Byrne

arXiv:1612.03791·cs.CL·February 14, 2017

TL;DR

This paper combines neural machine translation (NMT) with statistical machine translation (SMT) by adding the Bayes-risk of a translation under the SMT lattice to the NMT score during decoding. The method works for both word-level and subword-level NMT.

Contribution

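It presents a simple, efficient way to integrate risk estimation into the NMT decoder. Because the neural decoder is not restricted to the SMT search space, the approach is more flexible than $n$-best list or lattice rescoring and can produce hypotheses that the SMT lattice does not contain.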

Findings

01

Significant gains over lattice rescoring on several English-German and Japanese-English data sets, for both single and ensembled NMT.

02

Applicable to both word-level and subword-level NMT.

03

Produces novel translation hypotheses beyond standard rescoring.

Abstract

We present a novel scheme to combine neural machine translation (NMT) with traditional statistical machine translation (SMT). Our approach borrows ideas from linearised lattice minimum Bayes-risk decoding for SMT. The NMT score is combined with the Bayes-risk of the translation according to the SMT lattice. This makes our approach much more flexible than $n$-best list or lattice rescoring as the neural decoder is not restricted to the SMT search space. We show an efficient and simple way to integrate risk estimation into the NMT decoder which is suitable for word-level as well as subword-unit-level NMT. We test our method on English-German and Japanese-English and report significant gains over lattice rescoring on several data sets for both single and ensembled NMT. The MBR decoder produces entirely new hypotheses far beyond simply rescoring the SMT search space or fixing UNKs in the NMT…
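The idea of combining an NMT score with lattice-based risk can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a weighted list of SMT hypotheses as a stand-in for the lattice, and a linearised form in which each target position contributes the NMT log-probability, a constant, and weighted posteriors of the n-grams ending at that position. The names `ngram_posteriors`, `combined_score`, `theta0`, `thetas`, and `lam` are hypothetical.

```python
from collections import defaultdict


def ngram_posteriors(weighted_hyps, max_n=4):
    """Estimate P(n-gram occurs | SMT evidence) from weighted hypotheses.

    weighted_hyps: list of (tokens, probability) pairs whose probabilities
    sum to 1. This list is an assumed stand-in for a real SMT lattice.
    """
    post = defaultdict(float)
    for tokens, prob in weighted_hyps:
        seen = set()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(tuple(tokens[i:i + n]))
        # Count each n-gram once per hypothesis (occurrence posterior).
        for gram in seen:
            post[gram] += prob
    return dict(post)


def combined_score(tokens, nmt_logprobs, post, theta0, thetas, lam=1.0):
    """Sum, over target positions, the NMT term plus the n-gram evidence.

    nmt_logprobs[t] is the NMT log-probability of tokens[t] given its
    history; thetas[n-1] weights n-grams of order n (assumed parameters).
    """
    score = 0.0
    for t, logprob in enumerate(nmt_logprobs):
        score += lam * logprob + theta0
        for n, theta in enumerate(thetas, start=1):
            start = t - n + 1
            if start >= 0:
                gram = tuple(tokens[start:t + 1])
                score += theta * post.get(gram, 0.0)
    return score


# Toy SMT evidence: two hypotheses with posterior weights 0.6 and 0.4.
smt_hyps = [(["a", "b"], 0.6), (["a", "c"], 0.4)]
posteriors = ngram_posteriors(smt_hyps, max_n=2)

# A hypothesis outside the SMT list can still be scored.
in_list = combined_score(["a", "b"], [0.0, 0.0], posteriors, 0.0, [1.0, 1.0])
novel = combined_score(["a", "d"], [0.0, 0.0], posteriors, 0.0, [1.0, 1.0])
```

Because the score decomposes over target positions, a term like this could be added at each step of a left-to-right beam search. The hypothesis `["a", "d"]` is not in the toy SMT list but still receives a score, which is the sense in which the decoder is not restricted to the SMT search space.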
