TL;DR
This paper introduces a hybrid word-character neural machine translation (NMT) model for efficient open-vocabulary translation. It produces high-quality translations without unknown-word tokens and is especially effective for highly inflected languages such as Czech.
Contribution
The paper presents a novel hybrid NMT system combining word-level translation with character-level modeling for rare words, improving open vocabulary translation.
Findings
Achieves a new state-of-the-art BLEU score of 20.7 on WMT'15 English-Czech translation.
Improves BLEU by 2.1 to 11.4 points over models that already handle unknown words.
Generates well-formed words for complex, highly inflected languages such as Czech.
Abstract
Nearly all previous work on neural machine translation (NMT) has used quite restricted vocabularies, perhaps with a subsequent method to patch in unknown words. This paper presents a novel word-character solution to achieving open vocabulary NMT. We build hybrid systems that translate mostly at the word level and consult the character components for rare words. Our character-level recurrent neural networks compute source word representations and recover unknown target words when needed. The twofold advantage of such a hybrid approach is that it is much faster and easier to train than character-based ones; at the same time, it never produces unknown words as in the case of word-based models. On the WMT'15 English to Czech translation task, this hybrid approach offers an additional boost of +2.1-11.4 BLEU points over models that already handle unknown words. Our best system achieves a new…
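The routing idea in the abstract (word level for frequent words, character level for rare ones) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `build_vocab`, `route_source`, and `fill_unknowns` are hypothetical, and the character decoder is a stand-in callable rather than a trained recurrent network.

```python
UNK = "<unk>"  # placeholder a word-level decoder emits for out-of-vocabulary words


def build_vocab(tokens, max_size):
    """Keep the max_size most frequent words; all other words count as rare."""
    counts = {}
    for tok in tokens:
        counts[tok] = counts.get(tok, 0) + 1
    # Sort by descending frequency, breaking ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return set(ranked[:max_size])


def route_source(tokens, vocab):
    """Tag each source token with the component that would represent it.

    In-vocabulary words use a word-level embedding ("word"); rare words
    would be composed from their characters ("char").
    """
    return [(tok, "word" if tok in vocab else "char") for tok in tokens]


def fill_unknowns(word_output, char_decoder):
    """Replace each <unk> in the word-level output with a character-generated word.

    char_decoder is an assumed callable taking the target position and
    returning a string; a real system would run a character-level RNN here.
    """
    return [char_decoder(i) if tok == UNK else tok
            for i, tok in enumerate(word_output)]


vocab = build_vocab("the cat sat on the mat the cat".split(), max_size=2)
routed = route_source(["the", "dog"], vocab)
filled = fill_unknowns(["the", UNK, "sat"], lambda i: "kočka")
```

In this toy run, `routed` marks "the" for the word-level path and "dog" for the character-level path, and `filled` contains no `<unk>` token, which mirrors the abstract's claim that the hybrid system never outputs unknown words.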
