Fast Structured Decoding for Sequence Models
Zhiqing Sun, Zhuohan Li, Haoqing Wang, Zi Lin, Di He, Zhi-Hong Deng

TL;DR
This paper introduces a structured inference module using an efficient CRF approximation for non-autoregressive sequence models, significantly improving translation accuracy with minimal latency increase.
Contribution
It proposes a novel structured decoding method incorporating a dynamic transition CRF to enhance non-autoregressive models' consistency and performance.
Findings
Achieves higher BLEU scores than previous non-autoregressive models on multiple datasets.
Adds only 8-14ms of inference latency.
Abstract
Autoregressive sequence models achieve state-of-the-art performance in domains like machine translation. However, due to the autoregressive factorization nature, these models suffer from heavy latency during inference. Recently, non-autoregressive sequence models were proposed to reduce the inference time. However, these models assume that the decoding process of each token is conditionally independent of others. Such a generation process sometimes makes the output sentence inconsistent, and thus the learned non-autoregressive models could only achieve inferior accuracy compared to their autoregressive counterparts. To improve the decoding consistency and reduce the inference cost at the same time, we propose to incorporate a structured inference module into the non-autoregressive models. Specifically, we design an efficient approximation for Conditional Random Fields (CRF) for…
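To illustrate why conditionally independent decoding can produce inconsistent output and how a CRF layer addresses it, here is a minimal sketch in Python with NumPy. It uses exact Viterbi decoding for a generic linear-chain CRF over a toy two-token vocabulary. It is not the paper's efficient approximation and does not model dynamic transitions; the function names, scores, and transition matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np


def independent_decode(emissions):
    """Pick each position's best label on its own (conditional independence)."""
    return emissions.argmax(axis=1).tolist()


def viterbi_decode(emissions, transitions):
    """Exact MAP decoding for a generic linear-chain CRF.

    emissions:   (T, V) array of per-position label scores
    transitions: (V, V) array, transitions[i, j] scores label i followed by j
    """
    T, V = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((T, V), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best path ending in label i at t-1, then label j at t
        cand = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # Follow back-pointers from the best final label
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]


# Toy example (hypothetical scores): both positions slightly prefer label 0,
# while the transition scores penalize emitting the same label twice in a row.
emissions = np.array([[2.0, 1.5],
                      [2.0, 1.8]])
transitions = np.array([[-5.0, 0.0],
                        [0.0, -5.0]])

print(independent_decode(emissions))           # repeats label 0
print(viterbi_decode(emissions, transitions))  # avoids the repetition
```

Exact Viterbi costs O(T·V²) time, which becomes expensive at the vocabulary sizes used in machine translation; this cost is the reason an efficient approximation is needed for the CRF.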
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques · Topic Modeling
Methods: Conditional Random Field
