# The University of Sydney's Machine Translation System for WMT19

**Authors:** Liang Ding, Dacheng Tao

arXiv: 1907.00494 · 2019-07-02

## TL;DR

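This paper describes the University of Sydney's Transformer-based submission to the WMT19 shared news translation task. In the Finnish-to-English direction it scored 33.0 BLEU, the best among all participants, by combining back translation, data selection, data augmentation, model ensembling, reranking, and system combination.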

## Contribution

The paper proposes an augmentation method, Cycle Translation, and a data mixture strategy, Big/Small parallel construction, both intended to make fuller use of the synthetic corpus when training a Transformer-based system.

## Key findings

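- Scored 33.0 BLEU in the Finnish-to-English direction, the best among all WMT19 participants in that direction.
- The best result outperforms the baseline (a Transformer ensemble trained on the original parallel corpus) by approximately 5.3 BLEU points.
- In the authors' experiments, adding the listed techniques (BPE, back translation, multi-feature data selection, data augmentation, greedy model ensemble, reranking, ConMBR system combination, and post-processing) yields continuous BLEU improvements.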

## Abstract

This paper describes the University of Sydney's submission to the WMT 2019 shared news translation task. We participated in the Finnish$\rightarrow$English direction and obtained the best BLEU score (33.0) among all participants. Our system is based on self-attentional Transformer networks, into which we integrated the most recent effective strategies from academic research (e.g., BPE, back translation, multi-feature data selection, data augmentation, greedy model ensemble, reranking, ConMBR system combination, and post-processing). Furthermore, we propose a novel augmentation method, *Cycle Translation*, and a data mixture strategy, *Big*/*Small* parallel construction, to fully exploit the synthetic corpus. Extensive experiments show that adding the above techniques yields continuous improvements in BLEU, and the best result outperforms the baseline (a Transformer ensemble model trained with the original parallel corpus) by approximately 5.3 BLEU points, achieving state-of-the-art performance.
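The abstract names a greedy model ensemble but this summary does not include its details. As an illustration only, the following is a minimal sketch of generic greedy forward selection, assuming each candidate checkpoint can be scored on a development set. The function `greedy_ensemble`, the helper `toy_score`, the model names, and all scores are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def greedy_ensemble(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
) -> List[str]:
    """Greedy forward selection (generic sketch, not the authors' procedure).

    Repeatedly add the candidate that gives the highest dev-set score when
    joined to the current ensemble; stop when no candidate improves it.
    """
    selected: List[str] = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every one-model extension of the current ensemble.
        trials = [(score(selected + [m]), m) for m in remaining]
        top_score, top_model = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no remaining model improves the ensemble
        selected.append(top_model)
        remaining.remove(top_model)
        best = top_score
    return selected


def toy_score(models: List[str]) -> float:
    """Hypothetical dev-set BLEU lookup; the numbers are made up."""
    table = {
        frozenset({"a"}): 30.0,
        frozenset({"b"}): 29.5,
        frozenset({"c"}): 28.0,
        frozenset({"a", "b"}): 30.2,
        frozenset({"a", "c"}): 30.8,
        frozenset({"b", "c"}): 30.1,
        frozenset({"a", "b", "c"}): 30.6,
    }
    return table[frozenset(models)]


chosen = greedy_ensemble(["a", "b", "c"], toy_score)
print(chosen)
```

In a real setting, `score` would decode the development set with the given models ensembled and return BLEU, which makes each call expensive. Greedy selection needs a number of calls quadratic in the number of candidates in the worst case, whereas scoring every subset grows exponentially.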

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.00494/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1907.00494/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/1907.00494/full.md

---
Source: https://tomesphere.com/paper/1907.00494