# Syntactically Supervised Transformers for Faster Neural Machine Translation

**Authors:** Nader Akoury, Kalpesh Krishna, Mohit Iyyer

arXiv: 1906.02780 · 2020-10-06

## TL;DR

This paper introduces SynST, a syntactically supervised Transformer that first autoregressively predicts a chunked parse tree and then generates all target tokens in one shot conditioned on that parse, which speeds up neural machine translation inference.

## Contribution

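The paper proposes the syntactically supervised Transformer (SynST), which replaces token-by-token decoding with two stages: autoregressive prediction of a chunked parse tree, followed by one-shot generation of all target tokens conditioned on the predicted parse. This improves inference speed while achieving higher BLEU scores than most competing methods.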

## Key findings

- SynST decodes sentences ~5x faster than the baseline autoregressive Transformer.
- SynST achieves higher BLEU scores than most competing methods on En-De and En-Fr datasets.
- Both results come from a series of controlled experiments.

## Abstract

Standard decoders for neural machine translation autoregressively generate a single target token per time step, which slows inference, especially for long outputs. While architectural advances such as the Transformer fully parallelize the decoder computations at training time, inference still proceeds sequentially. Recent developments in non- and semi-autoregressive decoding produce multiple tokens per time step independently of the others, which improves inference speed but deteriorates translation quality. In this work, we propose the syntactically supervised Transformer (SynST), which first autoregressively predicts a chunked parse tree before generating all of the target tokens in one shot conditioned on the predicted parse. A series of controlled experiments demonstrates that SynST decodes sentences ~5x faster than the baseline autoregressive Transformer while achieving higher BLEU scores than most competing methods on En-De and En-Fr datasets.
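
The two-stage decoding described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the representation of a chunk as a `(label, size)` pair, and the toy stand-ins that replace the trained Transformer components. The sketch only shows why the number of sequential steps depends on the number of predicted chunks rather than on the number of target tokens.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical chunk representation: (syntactic label, number of tokens covered).
Chunk = Tuple[str, int]


def decode_two_stage(
    source: List[str],
    next_chunk: Callable[[List[str], List[Chunk]], Optional[Chunk]],
    fill_tokens: Callable[[List[str], List[Chunk]], List[str]],
    max_chunks: int = 50,
) -> Tuple[List[Chunk], List[str], int]:
    """Predict chunks one at a time, then produce all tokens in a single call."""
    chunks: List[Chunk] = []
    sequential_steps = 0

    # Stage 1: autoregressive loop over chunks (one model call per chunk).
    while len(chunks) < max_chunks:
        chunk = next_chunk(source, chunks)
        sequential_steps += 1
        if chunk is None:  # stand-in for an end-of-parse symbol
            break
        chunks.append(chunk)

    # Stage 2: a single call that emits every target token at once,
    # conditioned on the source and the full predicted parse.
    tokens = fill_tokens(source, chunks)
    sequential_steps += 1
    return chunks, tokens, sequential_steps


# Toy stand-ins for the learned components (purely illustrative).
def toy_next_chunk(source: List[str], chunks: List[Chunk]) -> Optional[Chunk]:
    plan = [("NP", 2), ("VP", 3), ("PP", 2)]
    return plan[len(chunks)] if len(chunks) < len(plan) else None


def toy_fill_tokens(source: List[str], chunks: List[Chunk]) -> List[str]:
    # Emit one placeholder token per position covered by each chunk.
    return [f"{label.lower()}_{i}" for label, size in chunks for i in range(size)]


if __name__ == "__main__":
    chunks, tokens, steps = decode_two_stage(
        ["a", "toy", "source"], toy_next_chunk, toy_fill_tokens
    )
    print(chunks, tokens, steps)
```

In this toy run, seven target tokens are produced in five sequential calls (three chunks, one stop signal, one token-filling call), whereas a token-by-token decoder would need at least one call per token. The actual speedup reported in the abstract depends on the trained model and chunking scheme, which this sketch does not reproduce.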

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.02780/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1906.02780/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1906.02780/full.md

---
Source: https://tomesphere.com/paper/1906.02780