Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than Character Level
Lifeng Han, Shaohui Kuang

TL;DR
This paper enhances Chinese-to-English neural machine translation by integrating Chinese radicals into the model, improving translation adequacy and outperforming baseline systems across multiple evaluation metrics.
Contribution
It introduces a novel method of incorporating Chinese radicals into NMT models, addressing OOV issues and improving translation quality for Chinese.
Findings
Radical-based models outperform the baseline on adequacy metrics
Word boundary knowledge is crucial for Chinese NMT performance
Radical integration improves translation of unseen words
Abstract
In neural machine translation (NMT), researchers face the challenge of translating unseen, or out-of-vocabulary (OOV), words. To address this, some researchers propose splitting words in Western languages such as English and German into sub-words or compounds. In this paper, we try to address the OOV issue and improve NMT adequacy for a harder language, Chinese, whose characters are even more sophisticated in composition. We integrate Chinese radicals into the NMT model with different settings to address the unseen-word challenge in Chinese-to-English translation. This can also be considered a semantic component of the MT system, since Chinese radicals usually carry the essential meaning of the words they are part of. Meaningful radicals and new characters can be integrated into NMT systems with our models. We use an attention-based NMT system as a…
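To make the idea of radical-level input concrete, here is a minimal Python sketch of pairing each segmented Chinese word with the radicals of its characters. It is not the paper's implementation: the `RADICALS` table, the `<unk_rad>` placeholder, and the function names are all hypothetical, and a real system would rely on a full character decomposition resource rather than a five-entry dictionary.

```python
# Hypothetical, hand-written radical table for illustration only.
# Each key is a Chinese character, each value is its semantic radical.
RADICALS = {
    "河": "氵",  # river -> water radical
    "湖": "氵",  # lake  -> water radical
    "海": "氵",  # sea   -> water radical
    "松": "木",  # pine  -> wood radical
    "树": "木",  # tree  -> wood radical
}


def radical_tokens(word, table=RADICALS, unk="<unk_rad>"):
    """Map each character of a word to its radical, or to a placeholder."""
    return [table.get(ch, unk) for ch in word]


def augment(words, table=RADICALS):
    """Pair every segmented word with the radicals of its characters."""
    return [(w, radical_tokens(w, table)) for w in words]


if __name__ == "__main__":
    # Assumes the sentence has already been segmented into words.
    for word, rads in augment(["松树", "海", "湖"]):
        print(word, rads)
```

In this sketch, two words that never co-occur in training data, such as 海 and 湖, still share the token 氵, which illustrates how radical features could give a model some signal about an unseen word.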
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
