Neural Machine Translation with Characters and Hierarchical Encoding

Alexander Rosenberg Johansen; Jonas Meinertz Hansen; Elias Khazen Obeid; Casper Kaae Sønderby; Ole Winther

arXiv:1610.06550 · cs.CL · October 21, 2016 · 2 cites

TL;DR

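This paper introduces a hierarchical char2word encoder for neural machine translation that takes individual characters as both input and output. The authors argue that the hierarchy reduces computational complexity and show that it improves translation performance. Attention plots indicate that common words are compressed into a single embedding, while rare words are represented character by character.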

Contribution

The paper proposes a hierarchical char2word encoder for character-level neural machine translation. The authors argue that this hierarchical representation of the character encoder reduces computational complexity, and they show that it improves translation performance.

Findings

01. The authors argue that hierarchical encoding of characters reduces computational complexity.

02. The model learns to compress common words into a single embedding.

03. Rare words, such as names and places, are represented character by character.

Abstract

Most existing Neural Machine Translation models use groups of characters or whole words as their unit of input and output. We propose a model with a hierarchical char2word encoder that takes individual characters as both input and output. First, we argue that this hierarchical representation of the character encoder reduces computational complexity, and show that it improves translation performance. Second, by qualitatively studying attention plots from the decoder, we find that the model learns to compress common words into a single embedding, whereas rare words, such as names and places, are represented character by character.
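
The hierarchical idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how characters are grouped into words, so the rule used here (keep the character-level state at each space and at the final character) is an assumption, as are the plain tanh recurrence, the random untrained weights, and every function name. The sketch only shows why a second-level encoder yields far fewer states for a decoder to attend over than a flat character encoder does.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_step(w_in, w_rec, x, h):
    """One step of a plain tanh RNN: h' = tanh(w_in @ x + w_rec @ h)."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_in[i], x))
            + sum(w * hj for w, hj in zip(w_rec[i], h))
        )
        for i in range(len(h))
    ]


def run_rnn(w_in, w_rec, inputs, hidden_size):
    """Run the RNN over a sequence and return the state after every step."""
    h = [0.0] * hidden_size
    states = []
    for x in inputs:
        h = rnn_step(w_in, w_rec, x, h)
        states.append(h)
    return states


def char2word_encode(sentence, hidden_size=8, seed=0):
    """Hypothetical two-level encoder: characters first, then word boundaries."""
    rng = random.Random(seed)
    # One random embedding per distinct character in the sentence.
    embed = {
        c: [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
        for c in sorted(set(sentence))
    }
    char_w_in = make_matrix(hidden_size, hidden_size, rng)
    char_w_rec = make_matrix(hidden_size, hidden_size, rng)
    word_w_in = make_matrix(hidden_size, hidden_size, rng)
    word_w_rec = make_matrix(hidden_size, hidden_size, rng)

    # Level 1: one state per character.
    char_states = run_rnn(
        char_w_in, char_w_rec, [embed[c] for c in sentence], hidden_size
    )

    # Assumed boundary rule: sample the state at each space and at the
    # last character, so each sampled state summarises one word.
    boundaries = [i for i, c in enumerate(sentence) if c == " "]
    if sentence and not sentence.endswith(" "):
        boundaries.append(len(sentence) - 1)

    # Level 2: one state per word, computed from the sampled states.
    word_states = run_rnn(
        word_w_in, word_w_rec, [char_states[i] for i in boundaries], hidden_size
    )
    return char_states, word_states


char_states, word_states = char2word_encode("the cat sat on the mat")
print(len(char_states), len(word_states))  # prints: 22 6
```

In this toy sentence a decoder attending over the second level would score 6 positions per output step instead of 22, which is the kind of saving the complexity argument refers to.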

Code & Models

Repositories

Styrke/master-code (TensorFlow, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications