# Simple and Effective Noisy Channel Modeling for Neural Machine Translation

**Authors:** Kyra Yee, Nathan Ng, Yann N. Dauphin, Michael Auli

arXiv: 1908.05731 · 2019-08-19

## TL;DR

This paper introduces a simple neural noisy channel model for machine translation that conditions on the full source sentence. It outperforms a direct model as well as strong alternatives such as right-to-left reranking and ensembles of direct models across four language pairs.

## Contribution

It proposes using standard sequence-to-sequence models as channel models. This avoids the latent variable models of previous work, which process source and target incrementally and therefore make decoding decisions based on partial source prefixes.
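
As general background rather than a formula quoted from this summary, the noisy channel formulation follows from Bayes' rule. The symbols $x$ (source sentence) and $y$ (candidate translation) are chosen here for illustration:

$$
\hat{y} = \arg\max_{y} \, p(y \mid x) = \arg\max_{y} \, \frac{p(x \mid y)\, p(y)}{p(x)} = \arg\max_{y} \, p(x \mid y)\, p(y)
$$

Here $p(x \mid y)$ is the channel model, $p(y)$ is the language model, and $p(y \mid x)$ is the direct model. The denominator $p(x)$ is constant for a given source sentence, so it drops out of the maximization.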

## Key findings

- Outperforms direct models by up to 3.2 BLEU on WMT'17 German-English translation.
- Consistently outperforms right-to-left reranking models and ensembles of direct models.
- These gains hold across the four language pairs evaluated.

## Abstract

Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentence. This makes decoding decisions based on partial source prefixes even though the full source is available. We pursue an alternative approach based on standard sequence-to-sequence models which utilize the entire source. These models perform remarkably well as channel models, even though they have neither been trained on, nor designed to factor over, incomplete target sentences. Experiments with neural language models trained on billions of words show that noisy channel models can outperform a direct model by up to 3.2 BLEU on WMT'17 German-English translation. We evaluate on four language pairs and our channel models consistently outperform strong alternatives such as right-to-left reranking models and ensembles of direct models.
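
To make the idea concrete, here is a minimal sketch of noisy channel reranking over a list of candidate translations. It is an illustration under stated assumptions, not the paper's implementation: the `Candidate` class, the weight `lam`, the unnormalized scoring rule, and all log-probabilities below are hypothetical. In practice the three scores would come from a direct model, a channel model, and a language model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate translation y for a fixed source sentence x (hypothetical container)."""
    text: str
    direct_logprob: float   # log p(y | x), from a direct model
    channel_logprob: float  # log p(x | y), from a channel model
    lm_logprob: float       # log p(y), from a language model


def noisy_channel_score(c: Candidate, lam: float = 1.0) -> float:
    # Assumed combination: direct score plus a weighted channel + LM term.
    # The weighting and any length normalization are illustrative choices.
    return c.direct_logprob + lam * (c.channel_logprob + c.lm_logprob)


def rerank(cands: list[Candidate], lam: float = 1.0) -> list[Candidate]:
    # Highest combined score first.
    return sorted(cands, key=lambda c: noisy_channel_score(c, lam), reverse=True)


# Made-up scores for illustration only.
candidates = [
    Candidate("the house is small", -2.0, -3.0, -4.0),
    Candidate("the home is little", -1.8, -4.5, -6.0),
]

print(rerank(candidates, lam=0.0)[0].text)  # direct model alone
print(rerank(candidates, lam=1.0)[0].text)  # with channel and language model
```

With `lam=0.0` the ranking depends only on the direct model. Raising `lam` lets the channel model and language model overturn that choice when a candidate explains the source better and is more fluent.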

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.05731/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1908.05731/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1908.05731/full.md

---
Source: https://tomesphere.com/paper/1908.05731