Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT
Yining Wang, Long Zhou, Jiajun Zhang, Chengqing Zong

TL;DR
This paper empirically compares character, subword, and hybrid translation granularities in Chinese-English NMT, revealing that subword models excel in Chinese-to-English while hybrid models are best for English-to-Chinese translation.
Contribution
To the authors' knowledge, the paper provides the first comprehensive analysis of how translation granularity affects NMT involving Chinese, and it offers guidance on choosing a granularity for each translation direction.
Findings
Subword models perform best for Chinese-to-English translation.
Hybrid word-character models are most suitable for English-to-Chinese translation.
Hybrid_BPE achieves the best results in Chinese-to-English translation.
Abstract
Neural machine translation (NMT), a new approach to machine translation, has been shown to outperform conventional statistical machine translation (SMT) across a variety of language pairs. Translation is an open-vocabulary problem, but most existing NMT systems operate with a fixed vocabulary, which leaves them unable to translate rare words. This problem can be alleviated by using different translation granularities, such as character, subword, and hybrid word-character. Translation involving Chinese is one of the most difficult tasks in machine translation; however, to the best of our knowledge, no other work has explored which translation granularity is most suitable for Chinese in NMT. In this paper, we conduct an extensive comparison using Chinese-English NMT as a case study. Furthermore, we discuss the advantages and disadvantages of various translation…
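As a rough illustration of the granularities named in the abstract, the sketch below segments a pre-tokenized sentence at the character level and at a hybrid word-character level. It is not the authors' implementation: the function names, the toy vocabulary, the example sentence, and the `<B>`/`<M>`/`<E>` position tags are all assumptions made for illustration. Subword segmentation (for example BPE) is left out because it needs merge rules learned from a corpus.

```python
from typing import Iterable, List, Set


def char_segment(words: Iterable[str]) -> List[str]:
    """Character granularity: split every word into its characters."""
    return [ch for word in words for ch in word]


def hybrid_segment(words: Iterable[str], vocab: Set[str]) -> List[str]:
    """Hybrid word-character granularity (illustrative sketch).

    Words found in the fixed vocabulary stay whole. Out-of-vocabulary
    words fall back to characters, each tagged with its position so the
    word boundary could be restored later. The tag names are an assumed
    convention, not taken from the paper.
    """
    out: List[str] = []
    for word in words:
        # Keep in-vocabulary words, and single-character words, unchanged.
        if word in vocab or len(word) == 1:
            out.append(word)
            continue
        last = len(word) - 1
        for i, ch in enumerate(word):
            tag = "<B>" if i == 0 else "<E>" if i == last else "<M>"
            out.append(tag + ch)
    return out


# Hypothetical pre-segmented sentence and fixed vocabulary.
words = ["我们", "喜欢", "翻译学"]
vocab = {"我们", "喜欢"}

char_tokens = char_segment(words)
hybrid_tokens = hybrid_segment(words, vocab)
```

With this toy input, the rare word "翻译学" is the only one broken into tagged characters by `hybrid_segment`, while `char_segment` splits every word. This shows the trade-off the abstract points to: finer granularity removes the rare-word problem at the cost of longer sequences.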
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
