Mixture Content Selection for Diverse Sequence Generation

Jaemin Cho; Minjoon Seo; Hannaneh Hajishirzi

arXiv:1909.01953 · cs.CL · September 5, 2019 · 6 cites

TL;DR

This paper introduces SELECTOR, a plug-and-play module that increases diversity in sequence generation by explicitly separating content selection from generation, improving accuracy and training efficiency on question generation and summarization.

Contribution

It proposes a mixture-of-experts approach, trained with stochastic hard-EM, for explicit content diversification in sequence generation models.

Findings

1. Achieved state-of-the-art top-1 accuracy on both the question generation and the summarization datasets.

2. Gained 6% in top-5 accuracy.

3. Trained 3.7 times faster than a previous state-of-the-art model.

Abstract

Generating diverse sequences is important in many NLP applications, such as question generation or summarization, that exhibit semantically one-to-many relationships between the source and the target sequences. We present a method to explicitly separate diversification from generation using a general plug-and-play module (called SELECTOR) that wraps around and guides an existing encoder-decoder model. The diversification stage uses a mixture of experts to sample different binary masks on the source sequence for diverse content selection. The generation stage uses a standard encoder-decoder model given each selected content from the source sequence. Due to the non-differentiable nature of discrete sampling and the lack of ground-truth labels for the binary mask, we leverage a proxy for the ground-truth mask and adopt stochastic hard-EM for training. In question generation (SQuAD) and abstractive…
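
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy probabilities, and the proxy mask (marking source tokens that also occur in the target) are assumptions made for illustration, and the hard E-step shown is the generic hard-EM assignment rather than the exact stochastic variant used in the paper.

```python
import math
import random


def proxy_mask_from_overlap(source_tokens, target_tokens):
    """Assumed proxy label: 1 if a source token also appears in the target."""
    target_set = set(target_tokens)
    return [1 if tok in target_set else 0 for tok in source_tokens]


def sample_masks(probs_per_expert, rng):
    """Sample one binary mask per expert from per-token Bernoulli probabilities."""
    return [
        [1 if rng.random() < p else 0 for p in probs]
        for probs in probs_per_expert
    ]


def mask_loss(probs, proxy_mask, eps=1e-9):
    """Binary cross-entropy of one expert's probabilities against the proxy mask."""
    return -sum(
        m * math.log(p + eps) + (1 - m) * math.log(1 - p + eps)
        for p, m in zip(probs, proxy_mask)
    )


def hard_em_select(probs_per_expert, proxy_mask):
    """Hard E-step: choose the single expert with the lowest loss.

    In a hard-EM training loop, only this expert would receive the
    gradient update in the following M-step.
    """
    losses = [mask_loss(probs, proxy_mask) for probs in probs_per_expert]
    best = min(range(len(losses)), key=losses.__getitem__)
    return best, losses


# Toy example (hypothetical data, two experts, six source tokens).
source = ["the", "cat", "sat", "on", "the", "mat"]
target = ["where", "did", "the", "cat", "sit"]
proxy_mask = proxy_mask_from_overlap(source, target)

expert_probs = [
    [0.9, 0.8, 0.1, 0.1, 0.7, 0.2],  # expert 0 focuses on "the cat"
    [0.1, 0.2, 0.9, 0.8, 0.1, 0.9],  # expert 1 focuses on "sat on ... mat"
]

# Diversification: each expert proposes its own content selection.
rng = random.Random(0)
masks = sample_masks(expert_probs, rng)

# Training signal: assign the example to the best-matching expert.
best_expert, losses = hard_em_select(expert_probs, proxy_mask)
```

At inference time, each sampled mask would be passed with the source sequence to the encoder-decoder, so that different experts lead to different generated outputs. Assigning each training example to a single expert is what encourages the experts to specialize in different selections.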

Code & Models

Repositories

clovaai/FocusSeq2Seq (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems