Coherent Keyphrase Extraction via Web Mining
Peter D. Turney (National Research Council of Canada)

TL;DR
This paper makes automatically extracted keyphrases more coherent by using web mining to measure the statistical association between candidate phrases, so that the output keyphrases are more semantically related to one another across different domains.
Contribution
It introduces a web mining-based enhancement to the Kea algorithm to improve keyphrase coherence and domain generalization.
Findings
Enhanced algorithm produces more coherent keyphrases
Improved cross-domain performance demonstrated
Web mining effectively measures semantic association
Abstract
Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic keyphrase extraction is to select keyphrases from within the text of a given document. Automatic keyphrase extraction makes it feasible to generate keyphrases for the huge number of documents that do not have manually assigned keyphrases. A limitation of previous keyphrase extraction algorithms is that the selected keyphrases are occasionally incoherent. That is, the majority of the output keyphrases may fit together well, but there may be a minority that appear to be outliers, with no clear semantic relation to the majority or to each other. This paper presents enhancements to the Kea keyphrase extraction algorithm that are designed to increase the coherence of the extracted keyphrases. The approach is to use…
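The abstract is cut off before it names the exact association measure, so the following is only a minimal sketch of one common way to turn web hit counts into a coherence score: pointwise mutual information (PMI) between a candidate phrase and the phrases already ranked highly. The function names, the toy hit counts, and the choice of PMI itself are illustrative assumptions, not details confirmed by the text above; a real system would obtain the counts from a search engine rather than a hard-coded table.

```python
import math
from typing import Dict, FrozenSet, List


def pmi(hits_a: int, hits_b: int, hits_ab: int, total: int) -> float:
    """PMI estimated from page-hit counts (hypothetical counts, not real queries)."""
    if hits_a <= 0 or hits_b <= 0 or total <= 0:
        raise ValueError("single-phrase hit counts and total must be positive")
    if hits_ab <= 0:
        # The phrases never co-occur: treat as maximally unassociated.
        return float("-inf")
    return math.log2((hits_ab * total) / (hits_a * hits_b))


def coherence(
    candidate: str,
    context: List[str],
    single_hits: Dict[str, int],
    pair_hits: Dict[FrozenSet[str], int],
    total: int,
) -> float:
    """Mean PMI between a candidate and the other top-ranked phrases."""
    others = [c for c in context if c != candidate]
    if not others:
        return 0.0
    scores = [
        pmi(
            single_hits[candidate],
            single_hits[c],
            pair_hits.get(frozenset((candidate, c)), 0),
            total,
        )
        for c in others
    ]
    return sum(scores) / len(scores)


# Toy data (assumed): hit counts for single phrases and for phrase pairs.
TOTAL_PAGES = 1_000_000
SINGLE = {
    "neural networks": 1000,
    "classification": 2000,
    "machine learning": 1500,
    "ice hockey": 1200,
}
PAIRS = {
    frozenset(("machine learning", "neural networks")): 300,
    frozenset(("machine learning", "classification")): 400,
    frozenset(("ice hockey", "neural networks")): 1,
    frozenset(("ice hockey", "classification")): 3,
}
CONTEXT = ["neural networks", "classification"]

on_topic = coherence("machine learning", CONTEXT, SINGLE, PAIRS, TOTAL_PAGES)
outlier = coherence("ice hockey", CONTEXT, SINGLE, PAIRS, TOTAL_PAGES)
print(on_topic > outlier)
```

In this toy setting the on-topic candidate scores higher than the outlier, which is the kind of signal an extractor could use as an extra feature to demote keyphrases that do not fit with the rest.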
Taxonomy
Topics
Advanced Text Analysis Techniques
