A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models
Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho

TL;DR
This paper introduces a unified framework for sequence generation that applies to both directed and undirected neural models, enabling new decoding strategies and achieving competitive translation results.
Contribution
It proposes a generalized model of sequence generation that unifies decoding in directed and undirected models, facilitating the adaptation of decoding algorithms across these paradigms.
Findings
Achieves constant-time translation results on par with linear-time translation results from the same undirected sequence model.
Demonstrates competitive performance on WMT'14 English-German translation.
Unifies various neural sequence models under a single framework.
Abstract
Undirected neural sequence models such as BERT (Devlin et al., 2019) have received renewed interest due to their success on discriminative natural language understanding tasks such as question-answering and natural language inference. The problem of generating sequences directly from these models has received relatively little attention, in part because generating from undirected models departs significantly from conventional monotonic generation in directed sequence models. We investigate this problem by proposing a generalized model of sequence generation that unifies decoding in directed and undirected models. The proposed framework models the process of generation rather than the resulting sequence, and under this framework, we derive various neural sequence models as special cases, such as autoregressive, semi-autoregressive, and refinement-based non-autoregressive models. This…
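To make the idea of modeling the generation process more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the names `generate`, `left_to_right`, `block`, `all_positions`, and `make_toy_proposer` are hypothetical, and the "model" is a toy stand-in that copies tokens from a fixed target. The sketch only shows how one loop, which repeatedly picks positions to update and then fills them in, can behave like autoregressive, semi-autoregressive, or refinement-style decoding depending on the position-selection rule.

```python
from typing import Callable, List

MASK = "<mask>"

Proposer = Callable[[List[str], int], str]
Selector = Callable[[List[str], int], List[int]]


def make_toy_proposer(target: List[str]) -> Proposer:
    """Hypothetical stand-in for a trained model: returns the target token."""
    def propose(seq: List[str], pos: int) -> str:
        return target[pos]
    return propose


def generate(length: int, num_steps: int,
             select_positions: Selector, propose: Proposer) -> List[str]:
    """Start from an all-mask sequence and update it for num_steps steps."""
    seq = [MASK] * length
    for t in range(num_steps):
        positions = select_positions(seq, t)
        # Propose from a snapshot so all updates within a step are parallel.
        snapshot = list(seq)
        for pos in positions:
            seq[pos] = propose(snapshot, pos)
    return seq


def left_to_right(seq: List[str], t: int) -> List[int]:
    """Autoregressive-style rule: one position per step, in order."""
    return [t] if t < len(seq) else []


def block(k: int) -> Selector:
    """Semi-autoregressive-style rule: k adjacent positions per step."""
    def select(seq: List[str], t: int) -> List[int]:
        start = t * k
        return list(range(start, min(start + k, len(seq))))
    return select


def all_positions(seq: List[str], t: int) -> List[int]:
    """Refinement-style rule: every position is rewritten at every step."""
    return list(range(len(seq)))


if __name__ == "__main__":
    target = ["wir", "sehen", "uns", "morgen"]
    propose = make_toy_proposer(target)
    print(generate(4, 4, left_to_right, propose))   # one token per step
    print(generate(4, 2, block(2), propose))        # two tokens per step
    print(generate(4, 1, all_positions, propose))   # all tokens in one step
```

In this sketch the number of steps for the `all_positions` rule does not depend on the sequence length, which is the sense in which such decoding can be called constant-time, whereas `left_to_right` needs one step per token. A real system would replace the toy proposer with model predictions and would typically choose positions and tokens using model scores.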
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax
