An Empirical Study on Chinese Character Decomposition in Multiword Expression-Aware Neural Machine Translation
Lifeng Han, Gareth J. F. Jones, Alan F. Smeaton

TL;DR
This paper systematically investigates how Chinese character decomposition affects multiword expression (MWE)-aware neural machine translation (NMT), addressing challenges specific to Chinese language processing and improving translation quality.
Contribution
It provides the first comprehensive analysis of Chinese character decomposition in MWE-aware NMT, highlighting its benefits for meaning preservation and translation accuracy.
Findings
Chinese character decomposition enhances MWE translation quality
Decomposition improves semantic representation of Chinese words
The approach effectively addresses MWE translation challenges in Chinese
Abstract
Word meaning, representation, and interpretation play fundamental roles in natural language understanding (NLU), natural language processing (NLP), and natural language generation (NLG) tasks. Many of the inherent difficulties in these tasks stem from Multi-word Expressions (MWEs), which complicate the tasks by introducing ambiguity, idiomatic expressions, infrequent usage, and a wide range of variations. Significant effort and substantial progress have been made in addressing the challenging nature of MWEs in Western languages, particularly English. This progress is attributed in part to well-established research communities and the abundant availability of computational resources. However, comparable progress has not been made for Chinese and closely related Asian languages, which continue to lag behind in this regard. While sub-word modelling has been…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
