WikiRank: Improving Keyphrase Extraction Based on Background Knowledge
Yang Yu, Vincent Ng

TL;DR
WikiRank is an unsupervised keyphrase extraction method that leverages Wikipedia as background knowledge, constructing a semantic graph to identify the most relevant keyphrases with improved accuracy.
Contribution
It introduces a novel approach that incorporates background knowledge from Wikipedia into keyphrase extraction via a semantic graph and optimization framework.
Findings
More than 2% improvement in F1-score over other state-of-the-art models
Effective use of Wikipedia as background knowledge
Unsupervised method suitable for diverse documents
Abstract
Keyphrases are an efficient representation of the main ideas of a document. While background knowledge can provide valuable information about documents, it is rarely incorporated in keyphrase extraction methods. In this paper, we propose WikiRank, an unsupervised method for keyphrase extraction based on background knowledge from Wikipedia. First, we construct a semantic graph for the document. Then we transform the keyphrase extraction problem into an optimization problem on the graph. Finally, we output the optimal keyphrase set. Our method improves over other state-of-the-art models by more than 2% in F1-score.
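The abstract describes three steps (build a semantic graph, pose an optimization problem on it, output the optimal keyphrase set) without giving the graph structure or the objective. The sketch below is one plausible reading, not the authors' implementation. It assumes the graph is bipartite, linking candidate phrases to the Wikipedia concepts they mention, and that the objective is the total weight of covered concepts, which it approximates greedily. The function name `greedy_keyphrases`, the concept weights, and the toy graph are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_keyphrases(
    phrase_concepts: Dict[str, Set[str]],
    concept_weight: Dict[str, float],
    k: int,
) -> List[str]:
    """Greedily pick up to k phrases maximizing the weight of newly covered concepts.

    Assumption: a bipartite phrase-to-concept graph and a weighted-coverage
    objective; the source summary does not specify either.
    """
    covered: Set[str] = set()
    selected: List[str] = []
    remaining = dict(phrase_concepts)
    for _ in range(k):
        best, best_gain = None, 0.0
        # Iterate in sorted order so ties are broken deterministically.
        for phrase in sorted(remaining):
            new_concepts = remaining[phrase] - covered
            gain = sum(concept_weight.get(c, 0.0) for c in new_concepts)
            if gain > best_gain:
                best, best_gain = phrase, gain
        if best is None:  # no remaining phrase adds any new weight
            break
        selected.append(best)
        covered |= remaining.pop(best)
    return selected


if __name__ == "__main__":
    # Hypothetical toy graph: candidate phrase -> Wikipedia concepts it links to.
    graph = {
        "keyphrase extraction": {"Keyword extraction", "Information extraction"},
        "semantic graph": {"Semantic network", "Graph (discrete mathematics)"},
        "background knowledge": {"Knowledge base"},
        "extraction": {"Information extraction"},
    }
    # Hypothetical concept weights (e.g. how prominent each concept is in the document).
    weights = {
        "Keyword extraction": 3.0,
        "Information extraction": 1.0,
        "Semantic network": 2.0,
        "Graph (discrete mathematics)": 1.0,
        "Knowledge base": 2.5,
    }
    print(greedy_keyphrases(graph, weights, k=2))
```

In this toy setup, "extraction" contributes nothing once "keyphrase extraction" is selected, because its only concept is already covered. That redundancy penalty is the kind of behavior a coverage-style objective over a concept graph would provide.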
Taxonomy
Topics: Advanced Text Analysis Techniques
