The University of Edinburgh's Neural MT Systems for WMT17
Rico Sennrich, Alexandra Birch, Anna Currey, Ulrich Germann, Barry Haddow, Kenneth Heafield, Antonio Valerio Miceli Barone, Philip Williams

TL;DR
This paper details the University of Edinburgh's neural machine translation systems for WMT17, highlighting their use of deep architectures, layer normalization, and BPE-based models across multiple language pairs, with extensive experimental analysis.
Contribution
Extends the previous year's BPE-based setup with deep architectures, layer normalization, and more compact models obtained through weight tying and improved BPE segmentation.
Findings
Deep architectures and layer normalization improve translation quality.
Back-translation and monolingual data enhance system performance.
Ensembling techniques further boost translation accuracy.
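As an illustration of the layer normalization mentioned in the findings, here is a minimal sketch of the standard formulation: each hidden vector is rescaled to zero mean and unit variance across its own features, then adjusted by a learned gain and bias. The function name `layer_norm`, the `eps` value, and the plain-list representation are assumptions made for this example; this is not the Nematus implementation.

```python
import math


def layer_norm(x, gain=None, bias=None, eps=1e-5):
    """Normalize one hidden vector across its features (illustrative sketch)."""
    n = len(x)
    mean = sum(x) / n
    # Biased variance over the feature dimension.
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    # Default to an identity gain and zero bias when none are supplied.
    gain = gain if gain is not None else [1.0] * n
    bias = bias if bias is not None else [0.0] * n
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gain, bias)]


if __name__ == "__main__":
    hidden = [1.0, 2.0, 3.0, 4.0]
    print(layer_norm(hidden))
```

With the default gain and bias, the output has a mean of approximately zero and a variance close to one, regardless of the scale of the input vector.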
Abstract
This paper describes the University of Edinburgh's submissions to the WMT17 shared news translation and biomedical translation tasks. We participated in 12 translation directions for news, translating between English and Czech, German, Latvian, Russian, Turkish and Chinese. For the biomedical task we submitted systems for English to Czech, German, Polish and Romanian. Our systems are neural machine translation systems trained with Nematus, an attentional encoder-decoder. We follow our setup from last year and build BPE-based models with parallel and back-translated monolingual training data. Novelties this year include the use of deep architectures, layer normalization, and more compact models due to weight tying and improvements in BPE segmentations. We perform extensive ablative experiments, reporting on the effectiveness of layer normalization, deep architectures, and different…
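For readers unfamiliar with the BPE-based models the abstract refers to, the following is a minimal sketch of how generic byte pair encoding learns merge operations from a word-frequency table. It shows only the basic algorithm, not the segmentation improvements the abstract mentions, which are not detailed here. The function name `learn_bpe`, the `</w>` end-of-word marker, the tie-breaking rule, and the toy vocabulary are all assumptions chosen for illustration.

```python
from collections import Counter


def learn_bpe(word_freqs, num_merges):
    """Learn BPE merge operations from a {word: frequency} table (sketch)."""
    # Represent each word as a tuple of symbols plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken lexicographically
        # so the result is deterministic (an assumption of this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word, replacing occurrences of the best pair
        # with a single merged symbol.
        new_vocab = {}
        for symbols, freq in vocab.items():
            merged = []
            i = 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            key = tuple(merged)
            new_vocab[key] = new_vocab.get(key, 0) + freq
        vocab = new_vocab
    return merges, vocab


if __name__ == "__main__":
    toy = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    merges, vocab = learn_bpe(toy, 3)
    print(merges)
    print(vocab)
```

On the toy table, three merges are enough to turn the frequent suffix "est" into a single subword unit, which is how BPE lets a fixed-size vocabulary cover rare and unseen words.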
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Byte Pair Encoding · Weight Tying
