Neural Machine Translation with Characters and Hierarchical Encoding
Alexander Rosenberg Johansen, Jonas Meinertz Hansen, Elias Khazen Obeid, Casper Kaae Sønderby, Ole Winther

TL;DR
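This paper introduces a hierarchical character-to-word (char2word) encoder for neural machine translation. The authors argue that it reduces the computational complexity of character-level encoding and show that it improves translation performance. Attention plots suggest that common words are compressed into single embeddings, while rare words are represented character by character.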
Contribution
The paper presents a translation model with a hierarchical char2word encoder that takes individual characters as both input and output. The authors argue that the hierarchy reduces the computational complexity of the character encoder and show that it improves translation performance.
Findings
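Hierarchical encoding of characters is argued to reduce computational complexity and is shown to improve translation performance.
Attention plots indicate that the model learns to compress common words into single embeddings.
Rare words, such as names and places, are represented character by character.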
Abstract
Most existing Neural Machine Translation models use groups of characters or whole words as their unit of input and output. We propose a model with a hierarchical char2word encoder that takes individual characters both as input and output. We first argue that this hierarchical representation of the character encoder reduces computational complexity, and show that it improves translation performance. Secondly, by qualitatively studying attention plots from the decoder we find that the model learns to compress common words into a single embedding whereas rare words, such as names and places, are represented character by character.
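To make the complexity argument in the abstract concrete, here is a minimal sketch of one way a char2word encoder could work. It is not the authors' implementation. It assumes that spaces mark word boundaries, uses a plain tanh RNN with random weights as a stand-in for the paper's recurrent encoder, and picks an arbitrary decoder length. All function names (`char_rnn_states`, `word_boundary_indices`, `char2word_encode`) are hypothetical.

```python
import math
import random


def char_rnn_states(chars, hidden_size=4, seed=0):
    """Toy tanh RNN over characters; returns one hidden state per character.

    The weights are random and untrained (an assumption for illustration).
    """
    rng = random.Random(seed)
    vocab = sorted(set(chars))
    embed = {c: [rng.uniform(-1, 1) for _ in range(hidden_size)] for c in vocab}
    w = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
         for _ in range(hidden_size)]
    h = [0.0] * hidden_size
    states = []
    for c in chars:
        # h_t = tanh(embed(c_t) + W h_{t-1})
        h = [math.tanh(embed[c][i] + sum(w[i][j] * h[j] for j in range(hidden_size)))
             for i in range(hidden_size)]
        states.append(h)
    return states


def word_boundary_indices(chars):
    """Positions of spaces, plus the final character if the text does not end in a space."""
    idx = [i for i, c in enumerate(chars) if c == " "]
    if chars and chars[-1] != " ":
        idx.append(len(chars) - 1)
    return idx


def char2word_encode(text):
    """Keep only the character-level states that sit at word boundaries."""
    states = char_rnn_states(text)
    return [states[i] for i in word_boundary_indices(text)]


source = "the cat sat on the mat"
target_len = 30  # hypothetical number of decoder steps

word_states = char2word_encode(source)
char_positions = len(source)       # 22 positions if attending over characters
word_positions = len(word_states)  # 6 positions if attending over word states

# Attention computes one score per (decoder step, encoder position) pair.
print(target_len * char_positions)  # 660
print(target_len * word_positions)  # 180
```

In this toy setting the decoder attends over 6 encoder states instead of 22, so the number of attention scores per sentence shrinks in proportion to the average word length, while each retained state has still been computed from every character that precedes it.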
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
