The NiuTrans Machine Translation Systems for WMT21
Shuhan Zhou, Tao Zhou, Binghao Wei, Yingfeng Luo, Yongyu Mu, Zefan Zhou, Chenglong Wang, Xuanjun Zhou, Chuanhao Lv, Yi Jing, Laohu Wang, Jingnan Zhang, Canan Huang, Zhongxiang Yan, Chi Hu, Bei Li, Tong Xiao, Jingbo Zhu

TL;DR
This paper details the NiuTrans neural machine translation systems for WMT21, employing advanced Transformer variants and techniques like back-translation and knowledge distillation to improve translation quality across multiple language pairs.
Contribution
Introduces the NiuTrans systems, which combine effective Transformer variants with several enhancement techniques for news translation across multiple language pairs in WMT21.
Findings
Achieved competitive translation performance across 9 language directions.
Used back-translation and knowledge distillation to further improve model performance.
Demonstrated the effectiveness of Transformer-DLCL and ODE-Transformer architectures.
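The two architectures named in the findings can be illustrated with a minimal NumPy sketch. It assumes the formulations commonly given for them in the literature: DLCL feeds each layer a weighted sum of the layer-normalized outputs of all earlier layers, and the ODE-Transformer replaces the first-order residual update with a second-order Runge-Kutta (RK2) update. The function names, the fixed weights, and the stand-in sublayer `f` are hypothetical; the actual NiuTrans configurations are not given on this page.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize the last axis to zero mean and unit variance (no learned gain or bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def dlcl_combine(prev_outputs, weights):
    """DLCL-style input for the next layer: a weighted sum of the
    layer-normalized outputs of all earlier layers (embedding included).
    In a trained model the weights are learned; here they are passed in."""
    combined = np.zeros_like(prev_outputs[0], dtype=float)
    for w, y in zip(weights, prev_outputs):
        combined += w * layer_norm(y)
    return combined


def euler_step(y, f):
    """Standard residual connection, viewed as a first-order (Euler) update."""
    return y + f(y)


def rk2_step(y, f):
    """Second-order Runge-Kutta update that applies the same sublayer f twice."""
    f1 = f(y)
    f2 = f(y + f1)
    return y + 0.5 * (f1 + f2)
```

With the toy sublayer `f = lambda y: 0.1 * y` and `y = 1`, `euler_step` gives 1.1 while `rk2_step` gives 1.105; the extra term is the second-order correction that the RK2 view adds on top of a plain residual block.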
Abstract
This paper describes the NiuTrans neural machine translation systems for the WMT 2021 news translation tasks. We made submissions to 9 language directions, including English↔{Chinese, Japanese, Russian, Icelandic} and English→Hausa tasks. Our primary systems are built on several effective variants of Transformer, e.g., Transformer-DLCL and ODE-Transformer. We also utilize back-translation, knowledge distillation, post-ensemble, and iterative fine-tuning techniques to enhance the model performance further.
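As an illustration of the back-translation step mentioned in the abstract, the sketch below shows the general data flow: target-side monolingual sentences are translated into the source language by a reverse model, and the resulting synthetic pairs are added to the genuine parallel data. The `toy_reverse_translate` function is a hypothetical placeholder for a trained target-to-source model, and the example sentences are invented; this is not the authors' pipeline.

```python
def back_translate(mono_target, reverse_translate):
    """Build synthetic (source, target) pairs from target-side monolingual text."""
    pairs = []
    for tgt in mono_target:
        synthetic_src = reverse_translate(tgt)  # machine-generated source side
        pairs.append((synthetic_src, tgt))      # target side stays genuine text
    return pairs


def toy_reverse_translate(sentence):
    # Hypothetical stand-in for a trained target-to-source NMT model.
    return "<src> " + sentence


# Genuine parallel data plus synthetic pairs form the augmented training set.
bitext = [("hello", "你好")]
mono = ["早上好", "谢谢"]
train_data = bitext + back_translate(mono, toy_reverse_translate)
```

Keeping the genuine text on the target side means the forward model still learns to produce fluent human-written output, even though the source side of the synthetic pairs is noisy.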
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Softmax · Layer Normalization · Label Smoothing · Residual Connection
