Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation

Matt Post; David Vilar

arXiv:1804.06609·cs.CL·November 13, 2018

TL;DR

This paper introduces an algorithm for lexically constrained decoding in neural machine translation that forces pre-specified words or phrases into the output at a cost that is O(1) in the number of constraints, and uses it to examine the relationship between model scores and BLEU scores.

Contribution

The paper presents an algorithm for lexically constrained decoding whose complexity is O(1) in the number of constraints, whereas earlier approaches are linear (Hokamp and Liu, 2017) or exponential (Anderson et al., 2017) in that number.

Findings

01

The algorithm effectively enforces lexical constraints in translation outputs.

02

It is used to explore the shaky relationship between model scores and BLEU scores.

03

Implementation is available in the Sockeye toolkit.

Abstract

The end-to-end nature of neural machine translation (NMT) removes many ways of manually guiding the translation process that were available in older paradigms. Recent work, however, has introduced a new capability: lexically constrained or guided decoding, a modification to beam search that forces the inclusion of pre-specified words and phrases in the output. However, while theoretically sound, existing approaches have computational complexities that are either linear (Hokamp and Liu, 2017) or exponential (Anderson et al., 2017) in the number of constraints. We present an algorithm for lexically constrained decoding with a complexity of O(1) in the number of constraints. We demonstrate the algorithm's remarkable ability to properly place these constraints, and use it to explore the shaky relationship between model and BLEU scores. Our implementation is available as part of Sockeye.
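To make the idea of a constant-size beam concrete, here is a minimal Python sketch of one way a fixed beam could be shared among hypotheses that have met different numbers of constraints. It is an illustration under stated assumptions, not the Sockeye implementation: the `Candidate` type, the `allocate_beam` function, the grouping of candidates into "banks" by constraints met, and the round-robin visiting order are all hypothetical names and choices made for this example. The point it illustrates is only that the beam size stays fixed no matter how many constraints are supplied.

```python
from collections import defaultdict
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical container for one partial translation."""
    tokens: tuple   # target tokens generated so far
    score: float    # model score (higher is better)
    met: int        # number of constraints already satisfied


def allocate_beam(candidates, beam_size):
    """Fill a fixed-size beam by visiting banks in round-robin order.

    Assumption: a "bank" is the group of candidates that have met the
    same number of constraints. Banks are visited from most constraints
    met to fewest, taking the best remaining candidate from each, until
    the beam is full or no candidates remain.
    """
    banks = defaultdict(list)
    for cand in candidates:
        banks[cand.met].append(cand)
    # Best-scoring candidate first within every bank.
    for bank in banks.values():
        bank.sort(key=lambda c: c.score, reverse=True)

    order = sorted(banks, reverse=True)  # most constraints met first
    beam = []
    while len(beam) < beam_size and any(banks[m] for m in order):
        for m in order:
            if banks[m] and len(beam) < beam_size:
                beam.append(banks[m].pop(0))
    return beam


if __name__ == "__main__":
    cands = [
        Candidate(("a",), -0.1, 0),
        Candidate(("b",), -0.2, 0),
        Candidate(("c",), -0.3, 0),
        Candidate(("d",), -2.0, 1),
        Candidate(("e",), -5.0, 2),
    ]
    for cand in allocate_beam(cands, beam_size=3):
        print(cand.met, cand.tokens, cand.score)
```

With plain top-k pruning, the three candidates that have met no constraints would fill a beam of size 3 and the constrained hypotheses would be lost. In this sketch each bank gets a slot first, so hypotheses that are progressing through the constraints survive even when their model scores are lower.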
