Better Document-Level Machine Translation with Bayes' Rule
Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

TL;DR
This paper introduces a novel document translation model based on Bayes' rule that leverages monolingual documents and parallel sentences, using a language model prior and reverse translation probability to improve translation quality.
Contribution
The paper proposes a Bayes' rule-based framework for document translation that incorporates cross-sentence context and can be trained without parallel documents, using only parallel sentences and monolingual documents.
Findings
Outperforms existing document translation methods
Utilizes cross-sentence context for better accuracy
Enables efficient inference with a beam-search algorithm
Abstract
We show that Bayes' rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents---a compelling benefit as parallel documents are not always available. In our formulation, the posterior probability of a candidate translation is the product of the unconditional (prior) probability of the candidate output document and the "reverse translation probability" of translating the candidate output back into the source language. Our proposed model uses a powerful autoregressive language model as the prior on target language documents, but it assumes that each sentence is translated independently from the target to the source language. Crucially, at test time, when a source document is observed, the document language model prior induces dependencies between the translations of the source sentences in the…
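The factorization described in the abstract can be sketched as a scoring function: the unnormalized log posterior of a candidate output document is the log prior of the whole document plus the sum of per-sentence reverse translation log probabilities. The sketch below is a minimal illustration, not the authors' implementation. The names `log_posterior`, `pick_best`, `toy_doc_lm`, and `toy_reverse` are hypothetical, the two toy models are hand-written stand-ins for trained networks, and exhaustive reranking over a fixed candidate list is used here in place of the beam search the paper describes.

```python
import math
from typing import Callable, List, Sequence

Doc = Sequence[str]  # a document is a sequence of sentences


def log_posterior(
    source: Doc,
    candidate: Doc,
    doc_lm_logprob: Callable[[Doc], float],
    reverse_logprob: Callable[[str, str], float],
) -> float:
    """Unnormalized log posterior: log p(y) + sum_i log p(x_i | y_i)."""
    if len(source) != len(candidate):
        raise ValueError("expects one candidate sentence per source sentence")
    # Prior: scores the candidate document as a whole (cross-sentence context).
    prior = doc_lm_logprob(candidate)
    # Reverse model: each sentence is scored independently, target -> source.
    channel = sum(reverse_logprob(x, y) for x, y in zip(source, candidate))
    return prior + channel


def pick_best(
    source: Doc,
    candidates: List[Doc],
    doc_lm_logprob: Callable[[Doc], float],
    reverse_logprob: Callable[[str, str], float],
) -> Doc:
    """Exhaustive reranking over a fixed candidate list (assumption: not beam search)."""
    return max(
        candidates,
        key=lambda y: log_posterior(source, y, doc_lm_logprob, reverse_logprob),
    )


def toy_doc_lm(doc: Doc) -> float:
    """Hypothetical prior: penalizes 'It ...' right after a sentence about a doctor."""
    score = -1.0 * len(doc)
    for prev, cur in zip(doc, doc[1:]):
        if "doctor" in prev and cur.startswith("It "):
            score -= 2.0
    return score


def toy_reverse(src_sentence: str, tgt_sentence: str) -> float:
    """Hypothetical reverse model: treats every candidate as equally faithful."""
    return math.log(0.5)


source = ["<source sentence 1>", "<source sentence 2>"]  # placeholder source text
candidates = [
    ["The doctor arrived.", "It smiled."],
    ["The doctor arrived.", "She smiled."],
]
best = pick_best(source, candidates, toy_doc_lm, toy_reverse)
```

Because the toy reverse model cannot tell the two candidates apart, the choice is made entirely by the document-level prior, which is the kind of cross-sentence dependency the abstract attributes to the language model at test time.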
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam
