Incorporating Discrete Translation Lexicons into Neural Machine Translation

Philip Arthur; Graham Neubig; Satoshi Nakamura

arXiv:1606.02006 · cs.CL · October 6, 2016 · 35 cites

TL;DR

This paper introduces a method to improve neural machine translation by integrating discrete translation lexicons, which enhances translation accuracy for low-frequency words and accelerates training convergence.

Contribution

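The authors compute a lexicon probability for the next target word by using the NMT attention vector to weight the lexical translation probabilities of the source words. They then combine this probability with the standard NMT probability in one of two ways: as a bias, or by linear interpolation.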

Findings

01. BLEU score improved by 2.0-2.3 points

02. Faster convergence during training

03. Enhanced translation of low-frequency words

Abstract

Neural machine translation (NMT) often makes mistakes in translating low-frequency content words that are essential to understanding the meaning of the sentence. We propose a method to alleviate this problem by augmenting NMT systems with discrete translation lexicons that efficiently encode translations of these low-frequency words. We describe a method to calculate the lexicon probability of the next word in the translation candidate by using the attention vector of the NMT model to select which source word lexical probabilities the model should focus on. We test two methods to combine this probability with the standard NMT probability: (1) using it as a bias, and (2) linear interpolation. Experiments on two corpora show an improvement of 2.0-2.3 BLEU and 0.13-0.44 NIST score, and faster convergence time.
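
The calculation described in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes the lexicon is available as a matrix whose row j holds the translation probabilities for source word j, that "using it as a bias" means adding the log of the lexicon probability to the pre-softmax NMT scores, and that the interpolation weight is a fixed scalar. The function names, the smoothing constant `eps`, the weight `lam`, and the toy numbers are all hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def lexicon_probability(lex_table, attention):
    """Attention-weighted lexicon probability of the next target word.

    lex_table: (src_len, vocab) array, row j = p_lex(y | source word j).
    attention: (src_len,) array of attention weights summing to 1.
    """
    return attention @ lex_table


def combine_bias(nmt_scores, p_lex, eps=1e-3):
    """Assumed 'bias' variant: add log(p_lex + eps) before the softmax.

    eps is a hypothetical smoothing constant that keeps words missing
    from the lexicon from receiving a score of minus infinity.
    """
    return softmax(nmt_scores + np.log(p_lex + eps))


def combine_linear(nmt_scores, p_lex, lam=0.5):
    """Assumed 'linear interpolation' variant with a fixed weight lam."""
    return lam * p_lex + (1.0 - lam) * softmax(nmt_scores)


# Toy example: 2 source words, 3-word target vocabulary.
lex_table = np.array([[0.9, 0.1, 0.0],
                      [0.0, 0.2, 0.8]])
attention = np.array([0.75, 0.25])   # model attends mostly to source word 0
nmt_scores = np.zeros(3)             # an uninformative NMT distribution

p_lex = lexicon_probability(lex_table, attention)
p_bias = combine_bias(nmt_scores, p_lex)
p_lin = combine_linear(nmt_scores, p_lex)
```

In this toy setting the NMT scores carry no preference, so both combinations shift probability toward the target word favored by the lexicon entry of the attended source word. That is the effect the abstract relies on for low-frequency content words.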

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications