# Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder

**Authors:** Caglar Gulcehre, Francis Dutil, Adam Trischler, Yoshua Bengio

arXiv: 1706.05087 · 2017-06-26

## TL;DR

This paper introduces a planning-based mechanism into character-level neural machine translation, improving alignment quality and translation performance by explicitly modeling future alignments within an end-to-end trainable framework.

## Contribution

The paper presents a planning mechanism for NMT decoders that explicitly models future alignments, improving translation quality and making the learned alignments easier to interpret.

## Key findings

- Outperforms a strong baseline on three character-level translation tasks from the WMT'15 corpus
- Computes intuitive alignments
- Achieves better performance with fewer parameters

## Abstract

We investigate the integration of a planning mechanism into an encoder-decoder architecture with an explicit alignment for character-level machine translation. We develop a model that plans ahead when it computes alignments between the source and target sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the strategic attentive reader and writer (STRAW) model. Our proposed model is end-to-end trainable with fully differentiable operations. We show that it outperforms a strong baseline on three character-level decoder neural machine translation tasks on the WMT'15 corpus. Our analysis demonstrates that our model can compute qualitatively intuitive alignments and achieves superior performance with fewer parameters.
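
The plan-or-follow control flow described in the abstract can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the authors' implementation. The plan is a `k x T` matrix of alignment scores (one row per future decoder step, one column per source position), and the commitment vector is a length-`k` list whose first entry decides whether to recompute the plan. The names `plan_step`, `propose_plan`, and `propose_commit` are hypothetical. The two `propose_*` callables stand in for learned networks. The hard `== 1` test stands in for whatever differentiable relaxation the paper uses. Padding the shifted commitment vector with `1` is an assumption made here so that an exhausted plan always triggers replanning.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def plan_step(plan, commit, propose_plan, propose_commit):
    """Run one decoder step of a simplified plan-or-follow loop.

    plan:   k x T list of lists; row i holds alignment scores for step t + i.
    commit: length-k list; commit[0] == 1 means "recompute the plan now".
    propose_plan, propose_commit: hypothetical stand-ins for learned modules.
    """
    src_len = len(plan[0])
    if commit[0] == 1:
        # Recompute both the alignment plan and the commitment vector.
        plan, commit, replanned = propose_plan(), propose_commit(), True
    else:
        # Follow the existing plan: shift everything forward by one step.
        plan = plan[1:] + [[0.0] * src_len]  # pad the plan with a zero row
        commit = commit[1:] + [1]            # assumption: pad with "replan"
        replanned = False
    # The attention weights for this step come from the first plan row.
    return plan, commit, softmax(plan[0]), replanned


# Toy usage: a fixed 3-step plan over 4 source positions (made-up numbers).
FIXED_PLAN = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
]


def toy_propose_plan():
    return [row[:] for row in FIXED_PLAN]


def toy_propose_commit():
    return [0, 0, 1]  # commit to the new plan for three steps


plan = [[0.0] * 4 for _ in range(3)]
commit = [1, 0, 0]  # force planning on the first step
flags, attended = [], []
for _ in range(4):
    plan, commit, attention, replanned = plan_step(
        plan, commit, toy_propose_plan, toy_propose_commit
    )
    flags.append(replanned)
    attended.append(attention.index(max(attention)))

print(flags)     # which steps recomputed the plan
print(attended)  # most-attended source position at each step
```

In this toy run the plan is recomputed on steps 1 and 4 and followed on the steps in between, so attention is computed from a row that was proposed up to two steps earlier.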

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.05087/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1706.05087/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/1706.05087/full.md

---
Source: https://tomesphere.com/paper/1706.05087