TL;DR
This paper applies neural unlikelihood (UL) training to sequence-to-sequence keyphrase generation. The method targets the repeated keyphrases that models trained with maximum likelihood estimation (MLE) tend to produce, and it increases output diversity while keeping quality competitive.
Contribution
The paper proposes an unlikelihood (UL) training approach that operates at the target token level and the copy token level, combined with K-step ahead prediction, to increase diversity in keyphrase generation models.
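To make the token-level idea concrete, here is a minimal sketch of a token-level unlikelihood term added to the usual MLE loss. It follows the generic unlikelihood formulation, in which tokens already present in the target prefix are treated as negative candidates and penalized through log(1 - p). The function name `token_unlikelihood_loss`, the weight `alpha`, and the dictionary-based probability format are assumptions made for illustration; the paper's exact candidate sets, its copy-level variant, and its K-step ahead prediction are not reproduced here.

```python
import math

def token_unlikelihood_loss(probs, targets, alpha=1.0, eps=1e-12):
    """Return (total, mle, ul) for one target sequence.

    probs[t] is a dict mapping token -> model probability at step t.
    targets[t] is the gold token at step t.
    alpha (hypothetical name) weights the unlikelihood term.
    """
    mle = 0.0
    ul = 0.0
    for t, gold in enumerate(targets):
        step = probs[t]
        # Standard likelihood term: reward probability on the gold token.
        mle -= math.log(max(step.get(gold, 0.0), eps))
        # Negative candidates (assumed): tokens already seen in the target
        # prefix, excluding the current gold token.
        negatives = set(targets[:t]) - {gold}
        for tok in negatives:
            # Unlikelihood term: penalize probability mass on repeats.
            ul -= math.log(max(1.0 - step.get(tok, 0.0), eps))
    return mle + alpha * ul, mle, ul

# Toy example: at step 1 the model still puts 0.5 on the already-used token "a".
probs = [{"a": 0.5, "b": 0.5}, {"a": 0.5, "b": 0.5}]
total, mle, ul = token_unlikelihood_loss(probs, ["a", "b"])
```

In this toy case the penalty shrinks as the model moves probability away from the already-generated token, which is the behavior the UL objective is meant to encourage.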
Findings
The proposed method substantially increases output diversity.
Quality metrics remain competitive with those of baseline models.
The method is effective on datasets from multiple domains.
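This summary does not state how diversity is measured. As one simple illustration, the fraction of duplicate keyphrases in a model's output can serve as a redundancy score, with lower values indicating more diverse output. The function name `duplicate_keyphrase_ratio` and the normalization used (lowercasing and collapsing whitespace) are assumptions for this sketch, not the paper's evaluation protocol.

```python
def duplicate_keyphrase_ratio(keyphrases):
    """Fraction of generated keyphrases that repeat an earlier one.

    Keyphrases are compared after lowercasing and collapsing whitespace
    (an assumed normalization). Returns 0.0 for an empty output.
    """
    normalized = [" ".join(kp.lower().split()) for kp in keyphrases]
    if not normalized:
        return 0.0
    unique = len(set(normalized))
    return 1.0 - unique / len(normalized)

# Toy output: the second phrase duplicates the first after normalization.
output = ["neural network", "Neural  Network", "keyphrase generation", "diversity"]
ratio = duplicate_keyphrase_ratio(output)
```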
Abstract
In this paper, we study sequence-to-sequence (S2S) keyphrase generation models from the perspective of diversity. Recent advances in neural natural language generation have made possible remarkable progress on the task of keyphrase generation, demonstrated through improvements on quality metrics such as F1-score. However, the importance of diversity in keyphrase generation has been largely ignored. We first analyze the extent of information redundancy present in the outputs generated by a baseline model trained using maximum likelihood estimation (MLE). Our findings show that repetition of keyphrases is a major issue with MLE training. To alleviate this issue, we adopt a neural unlikelihood (UL) objective for training the S2S model. Our version of UL training operates at (1) the target token level to discourage the generation of repeating tokens; (2) the copy token level to avoid copying…
