Conditional set generation using Seq2seq models
Aman Madaan, Dheeraj Rajagopal, Niket Tandon, Yiming Yang, Antoine Bosselut

TL;DR
This paper introduces a model-independent data augmentation method for Seq2Seq models that improves set generation by exposing order-invariance and cardinality signals during training, with a reported 20% average relative improvement on benchmarks.
Contribution
It proposes an algorithm for sampling informative label orders and jointly models set size and output by prepending the set size to the target sequence, improving Seq2Seq models without extra annotations.
Findings
20% average relative improvement on benchmarks
Effective augmentation for various Seq2Seq models
Leverages order-invariance and cardinality properties
Abstract
Conditional set generation learns a mapping from an input sequence of tokens to a set. Several NLP tasks, such as entity typing and dialogue emotion tagging, are instances of set generation. Seq2Seq models, a popular choice for set generation, treat a set as a sequence and do not fully leverage its key properties, namely order-invariance and cardinality. We propose a novel algorithm for effectively sampling informative orders over the combinatorial space of label orders. We jointly model the set cardinality and output by prepending the set size and taking advantage of the autoregressive factorization used by Seq2Seq models. Our method is a model-independent data augmentation approach that endows any Seq2Seq model with the signals of order-invariance and cardinality. Training a Seq2Seq model on this augmented data (without any additional annotations) gets an average relative improvement…
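The augmentation idea described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the function name `augment_set_example`, the `" | "` separator, and the target format are hypothetical, and label orders are drawn uniformly at random as a stand-in for the paper's algorithm for sampling informative orders, which the abstract does not specify. Each training pair keeps the same input but gets a different ordering of the label set, with the set size prepended so that an autoregressive decoder predicts the cardinality first and then conditions every label on it.

```python
import math
import random


def augment_set_example(source, labels, num_orders=3, seed=0, sep=" | "):
    """Build (source, target) training pairs for one set-valued example.

    Hypothetical helper: each target is the set size followed by the labels
    in one sampled order. Orders are sampled uniformly at random here, which
    is a simplification and not the paper's informative-order sampling.
    """
    rng = random.Random(seed)
    unique = sorted(set(labels))  # drop duplicates, fix a deterministic base order

    # A set of k labels has k! distinct orders, so never ask for more than that.
    wanted = min(num_orders, math.factorial(len(unique)))

    targets = set()
    while len(targets) < wanted:
        order = unique[:]
        rng.shuffle(order)
        # Prepend the cardinality so the decoder emits it before any label.
        targets.add(sep.join([str(len(order)), *order]))

    return [(source, target) for target in sorted(targets)]


if __name__ == "__main__":
    pairs = augment_set_example(
        "She released her third album in 2019.",
        ["person", "artist", "singer"],
    )
    for src, tgt in pairs:
        print(src, "->", tgt)
```

Because the output is ordinary (input, target) text pairs, a sketch like this can sit in front of any Seq2Seq training pipeline without changing the model, which matches the model-independent framing of the abstract.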
Taxonomy
Topics: Topic Modeling · Machine Learning in Bioinformatics · Multimodal Machine Learning Applications
Methods: Gated Linear Unit · Multi-Head Attention · Attention Is All You Need · Adafactor · Inverse Square Root Schedule · SentencePiece · T5 · BART · Linear Layer
