Insertion-based Decoding with automatically Inferred Generation Order
Jiatao Gu, Qi Liu, Kyunghyun Cho

TL;DR
This paper introduces InDIGO, a decoding algorithm that generates sequences in arbitrary orders through insertion operations, with performance competitive with or better than conventional left-to-right decoding on four tasks.
Contribution
It extends the Transformer to support insertion-based decoding, so the model can be trained with either a pre-defined generation order or adaptive orders obtained from beam search.
Findings
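InDIGO achieves competitive or even better performance compared to conventional left-to-right generation.
The model can generate sequences following arbitrary orders and adapts the generation order based on input information.
The approach is evaluated on four tasks: word order recovery, machine translation, image captioning, and code generation.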
Abstract
Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal. In this work, we propose a novel decoding algorithm -- InDIGO -- which supports flexible sequence generation in arbitrary orders through insertion operations. We extend Transformer, a state-of-the-art sequence generation model, to efficiently implement the proposed approach, enabling it to be trained with either a pre-defined generation order or adaptive orders obtained from beam-search. Experiments on four real-world tasks, including word order recovery, machine translation, image caption and code generation, demonstrate that our algorithm can generate sequences following arbitrary orders, while achieving competitive or even better performance compared to the conventional left-to-right generation. The generated sequences show that InDIGO adopts adaptive…
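To make the idea of generating a sequence through insertion operations concrete, here is a minimal sketch in Python. It assumes each decoding step names a token, an already generated anchor token, and the side of the anchor on which the new token is placed. This step encoding and the function name `apply_insertions` are illustrative assumptions, not the paper's implementation, and the sketch only replays a given order; it does not model how the order is predicted.

```python
from typing import List, Tuple

# One decoding step: (token, anchor, side).
# `anchor` is the index, in generation order, of an already generated token;
# `side` is "L" (insert directly left of the anchor) or "R" (directly right).
# This encoding is a hypothetical simplification for illustration.
Step = Tuple[str, int, str]


def apply_insertions(steps: List[Step]) -> List[str]:
    """Replay insertion steps and return the final left-to-right sequence."""
    # Tokens in the order they were generated; 0 and 1 are boundary markers.
    generated = ["<s>", "</s>"]
    # Current left-to-right arrangement, stored as indices into `generated`.
    layout = [0, 1]

    for token, anchor, side in steps:
        if not 0 <= anchor < len(generated):
            raise ValueError(f"anchor {anchor} has not been generated yet")
        if side not in ("L", "R"):
            raise ValueError(f"side must be 'L' or 'R', got {side!r}")

        pos = layout.index(anchor)
        insert_at = pos if side == "L" else pos + 1
        # New tokens must stay strictly between the two boundary markers.
        if insert_at == 0 or insert_at == len(layout):
            raise ValueError("cannot insert outside the boundary markers")

        generated.append(token)
        layout.insert(insert_at, len(generated) - 1)

    # Drop the boundary markers before returning.
    return [generated[i] for i in layout[1:-1]]


# An arbitrary (non-monotonic) order: "dream", then "I", then "a", then "have".
arbitrary_order = [
    ("dream", 0, "R"),  # right of <s>
    ("I", 0, "R"),      # right of <s>, so left of "dream"
    ("a", 2, "L"),      # left of "dream" (generation index 2)
    ("have", 3, "R"),   # right of "I" (generation index 3)
]

# Left-to-right decoding is the special case of always inserting left of </s>.
left_to_right = [(tok, 1, "L") for tok in ["I", "have", "a", "dream"]]

print(apply_insertions(arbitrary_order))
print(apply_insertions(left_to_right))
```

Both step lists yield the same final sentence, which illustrates why a fixed left-to-right order is only one of many valid generation orders for a given sequence.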
Taxonomy
Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Topic Modeling
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Adam · Dropout · Multi-Head Attention · Byte Pair Encoding · Dense Connections
