Compositional Representation of Morphologically-Rich Input for Neural Machine Translation

Duygu Ataman; Marcello Federico

arXiv:1805.02036 · cs.CL · May 8, 2018

TL;DR

This paper introduces a neural machine translation method that uses a bi-directional RNN to create compositional input representations, improving translation accuracy for morphologically rich languages, especially in low-resource settings.

Contribution

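Instead of relying on statistical sub-word segmentation, it replaces the source-language embedding layer of NMT with a bi-directional RNN that composes input representations from smaller units, improving translation quality across languages from different morphological typologies.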

Findings

1. Outperforms statistical sub-word segmentation methods by 1.71 to 2.48 BLEU points.

2. Effective in a low-resource setting, tested on five languages from different morphological typologies.

3. Consistently improves translation accuracy for morphologically rich languages.

Abstract

Neural machine translation (NMT) models are typically trained with fixed-size input and output vocabularies, which creates an important bottleneck on their accuracy and generalization capability. As a solution, various studies proposed segmenting words into sub-word units and performing translation at the sub-lexical level. However, statistical word segmentation methods have recently been shown to be prone to morphological errors, which can lead to inaccurate translations. In this paper, we propose to overcome this problem by replacing the source-language embedding layer of NMT with a bi-directional recurrent neural network that generates compositional representations of the input at any desired level of granularity. We test our approach in a low-resource setting with five languages from different morphological typologies, and under different composition assumptions. By training NMT to…
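
To make the idea of a compositional input layer concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes character trigrams as the unit of composition (only one possible granularity), a plain tanh recurrence rather than any particular gated cell, randomly initialized and untrained weights, and concatenation of the two final hidden states as the word representation. The names `char_ngrams` and `TinyBiRNNComposer` are hypothetical and introduced only for illustration.

```python
import math
import random


def char_ngrams(word, n=3):
    """Split a word into overlapping character n-grams.

    Assumption: '^' and '$' mark word boundaries; the unit choice is illustrative.
    """
    padded = "^" + word + "$"
    if len(padded) <= n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class TinyBiRNNComposer:
    """Toy stand-in for a word embedding lookup: composes a word vector
    from its sub-word units with a forward and a backward recurrence."""

    def __init__(self, unit_dim=4, hidden_dim=3, seed=0):
        self.rng = random.Random(seed)
        self.unit_dim = unit_dim
        self.hidden_dim = hidden_dim
        self.unit_vectors = {}  # unit embeddings, created lazily
        # One (input, recurrent) weight pair per direction.
        self.params = {
            direction: (self._matrix(hidden_dim, unit_dim),
                        self._matrix(hidden_dim, hidden_dim))
            for direction in ("fwd", "bwd")
        }

    def _matrix(self, rows, cols):
        return [[self.rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]

    def _embed(self, unit):
        # Random vector per unit; a trained model would learn these.
        if unit not in self.unit_vectors:
            self.unit_vectors[unit] = [self.rng.uniform(-0.5, 0.5)
                                       for _ in range(self.unit_dim)]
        return self.unit_vectors[unit]

    def _run(self, vectors, direction):
        # Simple recurrence: h_t = tanh(W_in x_t + W_rec h_{t-1}).
        w_in, w_rec = self.params[direction]
        h = [0.0] * self.hidden_dim
        for x in vectors:
            h = [math.tanh(sum(w * xi for w, xi in zip(w_in[j], x))
                           + sum(w * hi for w, hi in zip(w_rec[j], h)))
                 for j in range(self.hidden_dim)]
        return h

    def compose(self, word):
        """Return the concatenated final forward and backward states."""
        vectors = [self._embed(u) for u in char_ngrams(word)]
        return self._run(vectors, "fwd") + self._run(vectors[::-1], "bwd")


composer = TinyBiRNNComposer()
print(char_ngrams("evler"))            # the units fed to the recurrences
print(len(composer.compose("evler")))  # 2 * hidden_dim
```

Because the vector is built from units rather than looked up by word, an inflected form that never appeared in training can still be represented as long as its units are known, and related forms that share units share parameters. In a full NMT system, a layer of this kind would sit where the source embedding lookup normally is and be trained jointly with the rest of the network.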
