On the Properties of Neural Machine Translation: Encoder-Decoder Approaches

Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, Yoshua Bengio

arXiv:1409.1259·cs.CL·October 8, 2014·1.1k cites

TL;DR

This paper analyzes neural machine translation models, focusing on their performance with different sentence lengths and unknown words, and introduces a new gated recursive convolutional network that learns sentence structure.

Contribution

It analyzes the properties of neural machine translation using two models: the RNN Encoder-Decoder and a newly proposed gated recursive convolutional network that learns grammatical structure.

Findings

01 NMT performs relatively well on short sentences without unknown words.

02 Performance degrades rapidly as sentence length and the number of unknown words increase.

03 The proposed gated recursive convolutional network learns grammatical structure automatically.

Abstract

Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. Neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of neural machine translation using two models: the RNN Encoder-Decoder and a newly proposed gated recursive convolutional neural network. We show that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network learns a grammatical structure of a…
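The abstract's central idea, an encoder that maps a variable-length sentence to a fixed-length representation, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `make_encoder`, the dimensions, the random untrained weights, and above all the plain tanh recurrence, which is a generic RNN and not the architecture of either model studied in the paper.

```python
import math
import random


def make_encoder(input_dim, hidden_dim, seed=0):
    """Build a toy RNN encoder with random, untrained weights (hypothetical)."""
    rng = random.Random(seed)
    # W maps the current word vector into the hidden space;
    # U maps the previous hidden state into the hidden space.
    W = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)]
         for _ in range(hidden_dim)]
    U = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
         for _ in range(hidden_dim)]

    def encode(sentence):
        # sentence: a list of word vectors, each of length input_dim.
        h = [0.0] * hidden_dim
        for x in sentence:
            # Plain tanh recurrence: h_t = tanh(W x_t + U h_{t-1}).
            h = [
                math.tanh(
                    sum(W[i][j] * x[j] for j in range(input_dim))
                    + sum(U[i][k] * h[k] for k in range(hidden_dim))
                )
                for i in range(hidden_dim)
            ]
        # The final hidden state is the fixed-length summary of the sentence.
        return h

    return encode


encode = make_encoder(input_dim=3, hidden_dim=4)
short_sentence = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 2 words
long_sentence = short_sentence * 5                    # 10 words
summary_short = encode(short_sentence)
summary_long = encode(long_sentence)
print(len(summary_short), len(summary_long))
```

Both summaries have the same length regardless of how many words were fed in. This is the bottleneck the abstract points to: a longer sentence has to be squeezed into the same number of values, which is consistent with the reported drop in performance as sentence length grows.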

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques