Towards Better Chinese-centric Neural Machine Translation for Low-resource Languages
Bin Li, Yixuan Weng, Fei Xia, Hanjun Deng

TL;DR
This paper presents a Chinese-centric neural machine translation system for low-resource languages. It introduces techniques such as the Incomplete-Trust loss and bilingual curriculum learning, and it was the winning system in the low-resource multilingual translation challenge of the 2021 iFLYTEK AI Developer Competition.
Contribution
The paper introduces a new Incomplete-Trust loss function and combines it with strategies such as monolingual data enhancement and bilingual curriculum learning to improve low-resource Chinese-centric NMT, addressing a gap in research on non-English-centric low-resource translation.
Findings
Outperforms state-of-the-art methods on low-resource Chinese translation tasks.
Demonstrates effectiveness of Incomplete-Trust loss over traditional cross-entropy.
Shows benefits of bilingual curriculum learning and monolingual data enhancement.
Abstract
The last decade has witnessed enormous improvements in science and technology, stimulating a growing demand for economic and cultural exchange among countries. Building neural machine translation (NMT) systems has become an urgent need, especially in the low-resource setting. However, recent work tends to study NMT systems for low-resource languages centered on English, while few works focus on low-resource NMT systems centered on other languages such as Chinese. To address this gap, the low-resource multilingual translation challenge of the 2021 iFLYTEK AI Developer Competition provides Chinese-centric multilingual low-resource NMT tasks, in which participants are required to build NMT systems from the provided low-resource samples. In this paper, we present the winning competition system that leverages monolingual word embeddings data enhancement, bilingual curriculum…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
