The Volctrans Machine Translation System for WMT20

Liwei Wu; Xiao Pan; Zehui Lin; Yaoming Zhu; Mingxuan Wang; Lei Li

arXiv:2010.14806 · cs.CL · November 20, 2020

TL;DR

This paper presents the VolcTrans machine translation system developed for the WMT20 shared news translation task, which uses Transformer variants, data augmentation, and multilingual pre-training to improve translation quality across eight translation directions.

Contribution

The paper introduces a comprehensive translation system combining Transformer variants, data selection, synthetic data, ensemble methods, and multilingual pre-training for WMT20.

Findings

01 Achieved competitive translation performance across multiple language pairs.

02 Demonstrated effectiveness of data augmentation and ensemble techniques.

03 Showcased benefits of multilingual pre-training in translation quality.

Abstract

This paper describes our VolcTrans system for the WMT20 shared news translation task. We participated in 8 translation directions. Our basic systems are based on Transformer, with several variants (wider or deeper Transformers, dynamic convolutions). The final system includes text pre-processing, data selection, synthetic data generation, advanced model ensemble, and multilingual pre-training.
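
The abstract names model ensembling as one component but does not say how it is done. As a rough illustration only, the sketch below shows one common form of ensembling: averaging the next-token probability distributions of several models at a single decoding step. The function names, the logits, and the four-token vocabulary are all hypothetical, and this is not claimed to be the "advanced model ensemble" the authors used.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_step(per_model_logits):
    """Average the next-token distributions produced by several models."""
    dists = [softmax(logits) for logits in per_model_logits]
    vocab_size = len(dists[0])
    return [sum(d[i] for d in dists) / len(dists) for i in range(vocab_size)]


# Hypothetical logits from three models over a toy 4-token vocabulary.
logits = [
    [2.0, 1.0, 0.1, -1.0],
    [1.5, 1.8, 0.0, -0.5],
    [2.2, 0.9, 0.3, -1.2],
]

avg = ensemble_step(logits)
# Greedy choice under the averaged distribution.
best = max(range(len(avg)), key=avg.__getitem__)
print(best, [round(p, 3) for p in avg])
```

In this toy case the second model on its own prefers token 1, but the averaged distribution picks token 0, which is the kind of disagreement an ensemble is meant to smooth out.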

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Attention Is All You Need · Residual Connection · Multi-Head Attention · Layer Normalization · Byte Pair Encoding · Softmax · Adam