# A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning

**Authors:** Gonçalo M. Correia, André F. T. Martins

arXiv: 1906.06253 · 2019-06-17

## TL;DR

This paper introduces a transfer learning approach to automatic post-editing (APE) that fine-tunes pre-trained BERT models in both the encoder and the decoder. Trained on only 23K sentences for 3 hours on a single GPU, it is competitive with systems trained on 5M artificial sentences, and it sets a new state of the art when that artificial data is added.

## Contribution

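It presents a transfer learning method for APE that fine-tunes pre-trained BERT models in both the encoder and the decoder and explores several parameter sharing strategies, reducing data and training requirements while remaining competitive with systems trained on far more data.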

## Key findings

- Competitive results from training on only 23K sentences for 3 hours on a single GPU.
- State-of-the-art results when the artificial data is added to training.
- Far less training data and time than conventional APE pipelines, which rely on large artificial datasets (5M sentences in the compared systems) generated through back-translation.

## Abstract

Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system through human post-edits. APE systems are usually trained by complementing human post-edited data with large, artificial data generated through back-translations, a time-consuming process often no easier than training an MT system from scratch. In this paper, we propose an alternative where we fine-tune pre-trained BERT models on both the encoder and decoder of an APE system, exploring several parameter sharing strategies. By only training on a dataset of 23K sentences for 3 hours on a single GPU, we obtain results that are competitive with systems that were trained on 5M artificial sentences. When we add this artificial data, our method obtains state-of-the-art results.
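The abstract mentions fine-tuning pre-trained weights in both the encoder and the decoder under "several parameter sharing strategies" without listing them in this summary. The following is a minimal, library-free sketch of the general idea only: each block of weights is either tied between the two sides or copied so that each side fine-tunes its own version. The names `Block`, `build_ape_model`, `self_attention`, and `feed_forward` are hypothetical, and the choice of which blocks to share is an assumption for illustration, not the configuration used in the paper.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Block:
    """Stand-in for one group of weights (e.g. an attention sublayer)."""
    weights: list = field(default_factory=list)


def build_ape_model(pretrained: dict, shared: set) -> tuple:
    """Initialise encoder and decoder from the same pre-trained blocks.

    Blocks named in `shared` are tied (one object used by both sides);
    every other block is copied, so each side updates its own version.
    """
    encoder = {name: copy.deepcopy(block) for name, block in pretrained.items()}
    decoder = {}
    for name in pretrained:
        if name in shared:
            decoder[name] = encoder[name]  # tied: same object on both sides
        else:
            decoder[name] = copy.deepcopy(pretrained[name])  # independent copy
    return encoder, decoder


# Hypothetical pre-trained weights (toy values, not real BERT parameters).
pretrained = {
    "self_attention": Block([0.1, 0.2]),
    "feed_forward": Block([0.3, 0.4]),
}

# Assumed strategy for illustration: share self-attention, copy the rest.
encoder, decoder = build_ape_model(pretrained, shared={"self_attention"})

# Simulate one fine-tuning update applied on the encoder side.
encoder["self_attention"].weights[0] += 1.0
encoder["feed_forward"].weights[0] += 1.0
```

After the simulated update, the tied block changes on both sides while the copied block changes only in the encoder. In a real system the same distinction is usually expressed by reusing one module instance versus loading the same checkpoint into two separate instances.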

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.06253/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1906.06253/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1906.06253/full.md

---
Source: https://tomesphere.com/paper/1906.06253