# Learning to Refine Source Representations for Neural Machine Translation

**Authors:** Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu

arXiv: 1812.10230 · 2018-12-27

## TL;DR

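This paper introduces an encoder-refiner-decoder framework for neural machine translation that refines the source representations at each decoding step, using the target-side information generated so far. Because refining is time-consuming, a reinforcement-learning strategy decides at which decoding steps to refine, which preserves a reasonable improvement over the baseline without much decrease in decoding speed.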

## Contribution

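The paper proposes a refiner that dynamically updates the source representations during decoding, together with a reinforcement-learning strategy that decides at which decoding steps to refine. The approach significantly and consistently improves translation performance over the standard encoder-decoder framework.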

## Key findings

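- Significant and consistent improvement in translation performance over the standard encoder-decoder framework on Chinese-English and English-German tasks.
- With the refining strategy applied, results still show reasonable improvement over the baseline without much decrease in decoding speed.
- Dynamically refining the source representations during decoding outperforms the standard encoder-decoder framework, which encodes the source sentence once.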

## Abstract

Neural machine translation (NMT) models generally adopt an encoder-decoder architecture for modeling the entire translation process. The encoder summarizes the representation of the input sentence from scratch, which is potentially a problem if the sentence is ambiguous. When translating a text, humans often create an initial understanding of the source sentence and then incrementally refine it along the translation on the target side. Starting from this intuition, we propose a novel encoder-refiner-decoder framework, which dynamically refines the source representations based on the generated target-side information at each decoding step. Since the refining operations are time-consuming, we propose a strategy, leveraging the power of reinforcement learning models, to decide when to refine at specific decoding steps. Experimental results on both Chinese-English and English-German translation tasks show that the proposed approach significantly and consistently improves translation performance over the standard encoder-decoder framework. Furthermore, when the refining strategy is applied, results still show reasonable improvement over the baseline without much decrease in decoding speed.
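
The control flow described in the abstract can be illustrated with a toy decoding loop. This is only a sketch under stated assumptions, not the paper's model: the functions `refine`, `attend`, and `decode`, the gating formula, the state update, and the fixed `refine_prob` are all hypothetical stand-ins. In particular, a Bernoulli draw replaces the learned reinforcement-learning policy that decides when to refine.

```python
import math
import random


def refine(source, target_state, rate=0.5):
    """Hypothetical refiner: move each source vector toward the target state.

    The gate is a sigmoid of the dot product, so vectors that agree with the
    current target-side state are updated more strongly. This formula is an
    illustration only; it is not taken from the paper.
    """
    refined = []
    for vec in source:
        score = sum(s * t for s, t in zip(vec, target_state))
        gate = 1.0 / (1.0 + math.exp(-score))
        refined.append(
            [s + rate * gate * (t - s) for s, t in zip(vec, target_state)]
        )
    return refined


def attend(source, query):
    """Dot-product attention: softmax-weighted average of the source vectors."""
    scores = [sum(s * q for s, q in zip(vec, query)) for vec in source]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scores]
    total = sum(weights)
    return [
        sum(w / total * vec[i] for w, vec in zip(weights, source))
        for i in range(len(query))
    ]


def decode(source, steps, refine_prob, seed=0):
    """Run a toy decoder and count how many steps triggered a refinement."""
    rng = random.Random(seed)
    state = [0.1] * len(source[0])  # toy initial decoder state
    refinements = 0
    for _ in range(steps):
        # Assumption: a fixed-probability coin flip stands in for the learned
        # policy that decides whether to refine at this decoding step.
        if rng.random() < refine_prob:
            source = refine(source, state)  # rebinds locally; input untouched
            refinements += 1
        context = attend(source, state)
        # Toy state update mixing the previous state with the attended context.
        state = [math.tanh(0.5 * s + 0.5 * c) for s, c in zip(state, context)]
    return state, refinements


if __name__ == "__main__":
    toy_source = [[0.2, -0.1], [0.5, 0.3], [-0.4, 0.6]]
    for prob in (0.0, 0.5, 1.0):
        _, count = decode(toy_source, steps=10, refine_prob=prob)
        print(f"refine_prob={prob}: refined at {count} of 10 steps")
```

Setting `refine_prob` to 1.0 corresponds to refining at every decoding step, and 0.0 reduces the loop to a plain encoder-decoder pass over fixed source vectors. Values in between mimic the speed and quality trade-off that the abstract attributes to the refining strategy.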

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.10230/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1812.10230/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1812.10230/full.md

---
Source: https://tomesphere.com/paper/1812.10230