# The Transference Architecture for Automatic Post-Editing

**Authors:** Santanu Pal, Hongfei Xu, Nico Herbig, Sudip Kumar Naskar, Antonio Krueger, Josef van Genabith

arXiv: 1908.06151 · 2019-08-27

## TL;DR

This paper introduces transference, a multi-source automatic post-editing (APE) model built from transformer blocks that integrates source and machine translation information more tightly, and that outperforms the previous state of the art by 1 BLEU point on the WMT 2016, 2017, and 2018 English–German APE shared tasks.

## Contribution

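The paper proposes a new transformer-based multi-source APE architecture in which a decoder block without self-attention masking over the machine translation acts as a second encoder combining src and mt. Its output feeds a final decoder that generates the post-edit, and the model improves on the state of the art on the WMT 2016, 2017, and 2018 English–German APE shared tasks.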

## Key findings

- The transference model outperforms the previous state of the art by 1 BLEU point on the WMT 2016, 2017, and 2018 English–German APE shared tasks (PBSMT and NMT).
- Too few layers in the second encoder hurts performance.
- Reducing the number of decoder layers has little effect on performance.

## Abstract

In automatic post-editing (APE) it makes sense to condition post-editing (pe) decisions on both the source (src) and the machine translated text (mt) as input. This has led to multi-source encoder based APE approaches. A research challenge now is the search for architectures that best support the capture, preparation and provision of src and mt information and its integration with pe decisions. In this paper we present a new multi-source APE model, called transference. Unlike previous approaches, it (i) uses a transformer encoder block for src, (ii) followed by a decoder block, but without masking for self-attention on mt, which effectively acts as second encoder combining src -> mt, and (iii) feeds this representation into a final decoder block generating pe. Our model outperforms the state-of-the-art by 1 BLEU point on the WMT 2016, 2017, and 2018 English–German APE shared tasks (PBSMT and NMT). We further investigate the importance of our newly introduced second encoder and find that a too small amount of layers does hurt the performance, while reducing the number of layers of the decoder does not matter much.
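
The three-block wiring described in the abstract can be made concrete with a small sketch of which positions each block is allowed to attend to. This is an illustration only, not the authors' implementation: the block labels `enc_src`, `enc_src_mt`, and `dec_pe`, the `BlockPlan` type, and the helper functions are hypothetical names chosen here, and the sketch models only attention visibility, not the layers themselves.

```python
from dataclasses import dataclass
from typing import List, Optional

# mask[i][j] is True when query position i may attend to key position j.
Mask = List[List[bool]]


def full_mask(n_query: int, n_key: int) -> Mask:
    """Unmasked attention: every query position sees every key position."""
    return [[True] * n_key for _ in range(n_query)]


def causal_mask(n: int) -> Mask:
    """Masked self-attention: position i sees only positions 0..i."""
    return [[j <= i for j in range(n)] for i in range(n)]


@dataclass
class BlockPlan:
    name: str                        # hypothetical label for the block
    self_attention: Mask             # visibility within the block's own input
    cross_attention: Optional[Mask]  # None when the block has no cross-attention
    cross_source: Optional[str]      # name of the block whose output is attended to


def transference_plan(src_len: int, mt_len: int, pe_len: int) -> List[BlockPlan]:
    """Attention visibility for the three blocks, following the abstract."""
    return [
        # (i) ordinary transformer encoder over src
        BlockPlan("enc_src", full_mask(src_len, src_len), None, None),
        # (ii) decoder block over mt WITHOUT self-attention masking,
        #      cross-attending to the encoded src: acts as a second encoder
        BlockPlan("enc_src_mt", full_mask(mt_len, mt_len),
                  full_mask(mt_len, src_len), "enc_src"),
        # (iii) final decoder generating pe: causal self-attention,
        #       cross-attending to the second encoder's output
        BlockPlan("dec_pe", causal_mask(pe_len),
                  full_mask(pe_len, mt_len), "enc_src_mt"),
    ]


if __name__ == "__main__":
    for block in transference_plan(src_len=3, mt_len=4, pe_len=2):
        print(block.name, "self:", block.self_attention, "cross from:", block.cross_source)
```

The only structural difference between the second and third blocks in this sketch is the self-attention mask: because mt is fully available as input, the second block can look at the whole sentence at once, whereas the pe decoder must stay causal since it generates its output token by token.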

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.06151/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1908.06151/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1908.06151/full.md

---
Source: https://tomesphere.com/paper/1908.06151