Microsoft Research Asia's Systems for WMT19
Yingce Xia, Xu Tan, Fei Tian, Fei Gao, Weicong Chen, Yang Fan, Linyuan Gong, Yichong Leng, Renqian Luo, Yiren Wang, Lijun Wu, Jinhua Zhu, Tao Qin, Tie-Yan Liu

TL;DR
Microsoft Research Asia's systems placed first in 8 of the 11 language directions they entered in the WMT19 news translation tasks and second in the other three. The systems build on Transformer, back translation, and knowledge distillation, enhanced with MADL, MASS, NAO, and SCA.
Contribution
The paper describes translation systems that combine Transformer baselines, back translation, and knowledge distillation with four of the authors' recent techniques (MADL, MASS, NAO, and SCA) for the WMT19 news translation tasks.
Findings
Won 8 first places and 3 second places in WMT19 translation tasks.
Enhanced baseline systems with techniques like MADL, MASS, NAO, and SCA.
Demonstrated the effectiveness of integrated advanced methods in translation quality.
Abstract
We, Microsoft Research Asia, made submissions to 11 language directions in the WMT19 news translation tasks. We won first place in 8 of the 11 directions and second place in the other three. Our basic systems are built on Transformer, back translation, and knowledge distillation. We integrate several of our recent techniques to enhance the baseline systems: multi-agent dual learning (MADL), masked sequence-to-sequence pre-training (MASS), neural architecture optimization (NAO), and soft contextual data augmentation (SCA).
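As a rough illustration of one technique named in the abstract, the sketch below shows the general idea behind masked sequence-to-sequence pre-training: a contiguous span of a sentence is masked on the encoder side, and the decoder is trained to reconstruct that span. This is a minimal sketch and not the authors' implementation. The function name `mass_mask`, the `[MASK]` symbol, and the default `mask_ratio` of 0.5 are assumptions chosen for illustration.

```python
import random
from typing import List, Optional, Tuple

MASK = "[MASK]"  # hypothetical mask symbol, chosen for this sketch


def mass_mask(
    tokens: List[str],
    mask_ratio: float = 0.5,  # illustrative default, not taken from the source
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], List[str], int]:
    """Mask one contiguous span of `tokens`.

    Returns (encoder_input, decoder_target, span_start), where
    encoder_input has the span replaced by MASK symbols and
    decoder_target is the original span the decoder should predict.
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    rng = rng or random.Random()
    n = len(tokens)
    # Mask at least one token, at most the whole sentence.
    span_len = min(n, max(1, int(n * mask_ratio)))
    start = rng.randint(0, n - span_len)  # inclusive on both ends
    encoder_input = tokens[:start] + [MASK] * span_len + tokens[start + span_len:]
    decoder_target = tokens[start:start + span_len]
    return encoder_input, decoder_target, start


if __name__ == "__main__":
    sentence = "we made submissions to eleven language directions".split()
    enc, tgt, start = mass_mask(sentence, rng=random.Random(0))
    print("encoder input :", enc)
    print("decoder target:", tgt)
```

In an actual pre-training setup the encoder input and decoder target would be converted to subword IDs and batched; the sketch only covers how a single training pair could be constructed from monolingual text.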
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Adam · Dropout · Multi-Head Attention · Byte Pair Encoding · Dense Connections
