An Integrated Approach for Keyphrase Generation via Exploring the Power of Retrieval and Extraction
Wang Chen, Hou Pong Chan, Piji Li, Lidong Bing, Irwin King

TL;DR
This paper introduces an integrated keyphrase generation method combining extractive, generative, and retrieval techniques within a multi-task learning framework, significantly improving performance over existing methods.
Contribution
A multi-task learning framework jointly trains an extractive model and a generative model, uses keyphrases retrieved from similar training documents as external knowledge, and applies a neural merging module to combine and re-rank the predicted keyphrases.
Findings
Outperforms state-of-the-art methods on five benchmarks
Effectively combines extraction, generation, and retrieval for better results
Improves keyphrase relevance and diversity
Abstract
In this paper, we present a novel integrated approach for keyphrase generation (KG). Unlike previous works which are purely extractive or generative, we first propose a new multi-task learning framework that jointly learns an extractive model and a generative model. Besides extracting keyphrases, the output of the extractive model is also employed to rectify the copy probability distribution of the generative model, such that the generative model can better identify important contents from the given document. Moreover, we retrieve similar documents with the given document from training data and use their associated keyphrases as external knowledge for the generative model to produce more accurate keyphrases. For further exploiting the power of extraction and retrieval, we propose a neural-based merging module to combine and re-rank the predicted keyphrases from the enhanced generative…
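The two mechanisms described in the abstract, rectifying the copy distribution with extractive scores and retrieving keyphrases from similar training documents, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the multiply-and-renormalize rescaling rule, and the use of Jaccard similarity over token sets are all assumptions made for illustration only.

```python
from typing import List, Sequence, Set, Tuple


def rectify_copy_distribution(copy_probs: Sequence[float],
                              importance: Sequence[float]) -> List[float]:
    """Rescale copy probabilities by per-token importance and renormalize.

    Assumption: `importance` holds one score in [0, 1] per source token,
    as an extractive tagger might produce. The multiplicative rule is a
    hypothetical choice, not taken from the paper.
    """
    if len(copy_probs) != len(importance):
        raise ValueError("one importance score is needed per source token")
    scaled = [p * s for p, s in zip(copy_probs, importance)]
    total = sum(scaled)
    if total == 0.0:
        # Nothing survives rescaling; fall back to the original distribution.
        return list(copy_probs)
    return [x / total for x in scaled]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_keyphrases(query_tokens: Sequence[str],
                        corpus: Sequence[Tuple[Sequence[str], Sequence[str]]],
                        top_k: int = 1) -> List[str]:
    """Return keyphrases of the top_k training documents most similar to the query.

    Assumption: `corpus` is a list of (document tokens, keyphrases) pairs and
    similarity is Jaccard overlap of token sets.
    """
    query = set(query_tokens)
    ranked = sorted(corpus,
                    key=lambda doc: jaccard(query, set(doc[0])),
                    reverse=True)
    retrieved: List[str] = []
    for _, keyphrases in ranked[:top_k]:
        for kp in keyphrases:
            if kp not in retrieved:  # keep first occurrence, preserve rank order
                retrieved.append(kp)
    return retrieved


if __name__ == "__main__":
    # Toy inputs: three source tokens and two training documents.
    print(rectify_copy_distribution([0.5, 0.3, 0.2], [0.1, 0.9, 0.5]))
    toy_corpus = [
        (["neural", "keyphrase", "extraction"], ["keyphrase extraction"]),
        (["image", "classification"], ["computer vision"]),
    ]
    print(retrieve_keyphrases(["neural", "keyphrase", "generation"], toy_corpus))
```

In this sketch, tokens that the extractive scores mark as important receive a larger share of the copy probability mass, and the retrieved keyphrases would be passed to the generator as additional input.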
Taxonomy
Topics: Advanced Text Analysis Techniques
