One2Set: Generating Diverse Keyphrases as a Set
Jiacheng Ye, Tao Gui, Yichao Luo, Yige Xu, Qi Zhang

TL;DR
This paper introduces One2Set, a novel approach for keyphrase generation that treats keyphrases as an unordered set, using parallel generation with control codes and bipartite matching to improve diversity and accuracy.
Contribution
The paper proposes a new training paradigm and model that generate keyphrases as an unordered set, avoiding order bias and enhancing diversity compared to previous sequence-based methods.
Findings
Outperforms state-of-the-art methods on multiple benchmarks.
Increases diversity and reduces duplication in generated keyphrases.
Effectively models keyphrases as an unordered set.
Abstract
Recently, the sequence-to-sequence models have made remarkable progress on the task of keyphrase generation (KG) by concatenating multiple keyphrases in a predefined order as a target sequence during training. However, the keyphrases are inherently an unordered set rather than an ordered sequence. Imposing a predefined order will introduce wrong bias during training, which can highly penalize shifts in the order between keyphrases. In this work, we propose a new training paradigm One2Set without predefining an order to concatenate the keyphrases. To fit this paradigm, we propose a novel model that utilizes a fixed set of learned control codes as conditions to generate a set of keyphrases in parallel. To solve the problem that there is no correspondence between each prediction and target during training, we propose a K-step target assignment mechanism via bipartite matching, which…
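The target assignment idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the token-overlap cost, the `<none>` padding string, and the function names `match_cost` and `assign_targets` are assumptions made for illustration only. The sketch pads the target set with an empty placeholder so that every prediction slot receives exactly one target, then searches for the one-to-one assignment with the lowest total cost.

```python
from itertools import permutations


def match_cost(prediction: str, target: str) -> float:
    """Toy cost (an assumption): 1 minus the Jaccard overlap of tokens."""
    p, t = set(prediction.split()), set(target.split())
    if not p and not t:
        return 0.0
    return 1.0 - len(p & t) / len(p | t)


def assign_targets(predictions, targets, empty="<none>"):
    """Return one target per prediction slot via minimum-cost matching.

    Targets are padded with `empty` (a hypothetical placeholder) so the
    two sides have equal size. The search is exhaustive, so it is only
    practical for a handful of slots.
    """
    if len(targets) > len(predictions):
        raise ValueError("need at least as many slots as targets")
    padded = list(targets) + [empty] * (len(predictions) - len(targets))

    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(padded))):
        # perm[i] is the index of the target given to prediction slot i
        cost = sum(match_cost(predictions[i], padded[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return [padded[j] for j in best_perm]


predictions = ["set prediction", "keyphrase generation model", "<none>"]
targets = ["keyphrase generation", "set prediction"]
assigned = assign_targets(predictions, targets)
# The result does not depend on the order in which targets are listed.
assigned_reversed = assign_targets(predictions, list(reversed(targets)))
```

Because the assignment is recomputed from the predictions, listing the targets in a different order gives the same per-slot targets, which is the property that sequence-style training lacks. For more than a few slots, the exhaustive search would normally be replaced by a polynomial-time assignment solver such as the Hungarian algorithm.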
Taxonomy
Topics: Advanced Text Analysis Techniques
