An Efficient Character-Level Neural Machine Translation
Shenjian Zhao, Zhihua Zhang

TL;DR
This paper introduces an efficient deep character-level neural machine translation model that avoids the large-vocabulary problem, trains faster with less memory, and can translate misspelled words much as a human reader would.
Contribution
It proposes a novel architecture with decimator and interpolator components for effective character-level translation, enhancing speed and robustness.
Findings
Achieves comparable translation quality to state-of-the-art systems.
Significantly reduces training time and memory usage.
Capable of translating misspelled words effectively.
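The following is a minimal sketch of why a character-level input can still represent a misspelled word, and it is not taken from the paper. The vocabularies, the example sentence, and the helper names (`word_level_tokens`, `char_level_tokens`) are all hypothetical. It shows only the input side: a word-level vocabulary maps an unseen spelling to an unknown token, while a character vocabulary still covers every symbol.

```python
# Toy illustration only: the vocabularies and the sentence are invented.
word_vocab = {"the", "weather", "is", "nice", "today"}
char_vocab = set("abcdefghijklmnopqrstuvwxyz ")

def word_level_tokens(sentence, vocab, unk="<unk>"):
    """Map each space-delimited word to itself, or to unk if it is out of vocabulary."""
    return [w if w in vocab else unk for w in sentence.split()]

def char_level_tokens(sentence, vocab, unk="<unk>"):
    """Map each character to itself, or to unk if it is out of vocabulary."""
    return [c if c in vocab else unk for c in sentence]

misspelled = "the wether is nice today"  # "weather" is misspelled
word_tokens = word_level_tokens(misspelled, word_vocab)
char_tokens = char_level_tokens(misspelled, char_vocab)

print(word_tokens)                   # the misspelled word becomes <unk>
print("<unk>" in char_tokens)        # no character is out of vocabulary
```

Whether the model then translates the misspelled word correctly depends on the learned representation; the sketch shows only that the information is not discarded at the input.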
Abstract
Neural machine translation aims at building a single large neural network that can be trained to maximize translation performance. The encoder-decoder architecture with an attention mechanism achieves a translation performance comparable to the existing state-of-the-art phrase-based systems on the task of English-to-French translation. However, the use of large vocabulary becomes the bottleneck in both training and improving the performance. In this paper, we propose an efficient architecture to train a deep character-level neural machine translation by introducing a decimator and an interpolator. The decimator is used to sample the source sequence before encoding while the interpolator is used to resample after decoding. Such a deep model has two major advantages. It avoids the large vocabulary issue radically; at the same time, it is much faster and more memory-efficient in training…
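The decimator and interpolator described in the abstract can be pictured with a toy sketch. This is an illustration under stated assumptions, not the authors' architecture. It assumes whitespace word boundaries, random character embeddings, and mean pooling, whereas the paper's components are learned networks. The names `embed`, `decimate`, and `interpolate` are hypothetical.

```python
import random

random.seed(0)
EMBED_DIM = 4
char_embedding = {}  # hypothetical lookup table, filled lazily with random vectors

def embed(ch):
    """Return a fixed random vector for a character (stand-in for a learned embedding)."""
    if ch not in char_embedding:
        char_embedding[ch] = [random.uniform(-1, 1) for _ in range(EMBED_DIM)]
    return char_embedding[ch]

def decimate(sentence):
    """Downsample a character sequence to one vector per space-delimited word.

    Assumption: mean pooling over character embeddings; a real model would
    use a learned encoder here.
    """
    vectors = []
    for word in sentence.split():
        chars = [embed(c) for c in word]
        vectors.append([sum(col) / len(chars) for col in zip(*chars)])
    return vectors

def interpolate(word_outputs):
    """Upsample word-level outputs back to a character sequence.

    Assumption: each output is already a string; a real model would generate
    the characters of each word from a word-level decoder state.
    """
    return list(" ".join(word_outputs))

source = "the cat sat"
encoded_input = decimate(source)
print(len(source), "characters ->", len(encoded_input), "vectors")
print(interpolate(["le", "chat"]))
```

The point of the sketch is the change in sequence length: the encoder sees one step per word rather than one per character, which is one plausible reading of how such a design could reduce training time and memory.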
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
