An Efficient Character-Level Neural Machine Translation

Shenjian Zhao; Zhihua Zhang

arXiv:1608.04738 · cs.CL · August 22, 2016 · 5 cites

TL;DR

This paper introduces an efficient deep character-level neural machine translation model that avoids the large-vocabulary problem, improves training speed and memory efficiency, and can translate misspelled words much as a human reader would.

Contribution

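It proposes an architecture built around two components: a decimator, which samples the source sequence before encoding, and an interpolator, which resamples after decoding. Together they make character-level translation faster to train and more robust.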

Findings

01. Achieves translation quality comparable to state-of-the-art systems.

02. Significantly reduces training time and memory usage.

03. Translates misspelled words effectively.

Abstract

Neural machine translation aims at building a single large neural network that can be trained to maximize translation performance. The encoder-decoder architecture with an attention mechanism achieves translation performance comparable to the existing state-of-the-art phrase-based systems on the task of English-to-French translation. However, the use of a large vocabulary becomes the bottleneck in both training and improving performance. In this paper, we propose an efficient architecture for training a deep character-level neural machine translation model by introducing a decimator and an interpolator. The decimator is used to sample the source sequence before encoding, while the interpolator is used to resample after decoding. Such a deep model has two major advantages. It radically avoids the large-vocabulary issue; at the same time, it is much faster and more memory-efficient in training…
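The decimator/interpolator pairing described in the abstract can be illustrated with a toy sketch. The abstract does not state the sampling rule, so this example assumes, purely for illustration, that the decimator keeps one per-character state at each space delimiter (and at the end of the sequence) and that the interpolator repeats each retained state once per character of its segment. The function names `decimate` and `interpolate` are hypothetical and are not taken from the paper or its repository, and plain integers stand in for the hidden-state vectors a real model would use.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def decimate(chars: str, states: Sequence[T],
             delimiter: str = " ") -> Tuple[List[T], List[int]]:
    """Keep one state per segment (ASSUMPTION: segments end at delimiters).

    Returns the retained states and the length of each segment in
    characters, counting the trailing delimiter as part of the segment.
    """
    if len(chars) != len(states):
        raise ValueError("need exactly one state per character")
    kept: List[T] = []
    lengths: List[int] = []
    start = 0
    for i, ch in enumerate(chars):
        is_last = i == len(chars) - 1
        if ch == delimiter or is_last:
            kept.append(states[i])          # sample the state at the boundary
            lengths.append(i - start + 1)   # characters covered by this sample
            start = i + 1
    return kept, lengths


def interpolate(segment_states: Sequence[T],
                lengths: Sequence[int]) -> List[T]:
    """Resample back to character rate by repeating each segment state."""
    out: List[T] = []
    for state, n in zip(segment_states, lengths):
        out.extend([state] * n)
    return out


if __name__ == "__main__":
    text = "a cat sat"
    char_states = list(range(len(text)))    # stand-ins for per-character states
    kept, lengths = decimate(text, char_states)
    restored = interpolate(kept, lengths)
    print(kept, lengths, restored)
```

Under these assumptions the encoder would operate on the shorter decimated sequence (three states here instead of nine), which is the intuition behind the speed and memory savings the abstract claims; the interpolated sequence has one entry per character again.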

Code & Models

Repositories

SwordYork/DCNMT (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques