High-Precision Extraction of Emerging Concepts from Scientific Literature

Daniel King; Doug Downey; Daniel S. Weld

arXiv:2006.06877·cs.IR·March 24, 2021

TL;DR

This paper introduces an unsupervised method for extracting emerging scientific concepts with high precision by identifying influential papers that introduce or popularize these concepts, significantly outperforming previous techniques.

Contribution

The authors propose a novel unsupervised approach that leverages citation patterns to accurately identify new scientific concepts, achieving higher precision than existing methods.

Findings

01

Precision@1000 of 99% for the proposed method

02

Outperforms prior work, which reaches 86% Precision@1000

03

Provides code and data for further research
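
For readers unfamiliar with the metric in the findings above, the following is a minimal sketch of Precision@K: the fraction of the top K ranked extractions that are judged to be correct. The function name `precision_at_k` and the toy inputs are illustrative assumptions and are not taken from the paper or its repository.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], correct: Set[str], k: int) -> float:
    """Fraction of the top-k ranked items that appear in `correct`.

    Assumption: if fewer than k items were extracted, we divide by the
    number actually returned; other conventions divide by k.
    """
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(item in correct for item in top) / len(top)


# Toy example: two of the top four extractions are judged correct.
ranked_terms = ["resnet", "this paper", "bert", "our method"]
judged_correct = {"resnet", "bert"}
score = precision_at_k(ranked_terms, judged_correct, k=4)
```

Under this definition, a Precision@1000 of 99% means that 990 of the 1000 highest-ranked extractions were judged to be valid concepts.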

Abstract

Identification of new concepts in scientific literature can help power faceted search, scientific trend analysis, knowledge-base construction, and more, but current methods are lacking. Manual identification cannot keep up with the torrent of new publications, while the precision of existing automatic techniques is too low for many applications. We present an unsupervised concept extraction method for scientific literature that achieves much higher precision than previous work. Our approach relies on a simple but novel intuition: each scientific concept is likely to be introduced or popularized by a single paper that is disproportionately cited by subsequent papers mentioning the concept. From a corpus of computer science papers on arXiv, we find that our method achieves a Precision@1000 of 99%, compared to 86% for prior work, and a substantially better precision-yield trade-off across…
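
The intuition in the abstract can be illustrated with a small sketch. This is not the paper's scoring function; it is a simplified stand-in that, for a candidate term, finds the paper cited by the largest share of later papers mentioning that term. The `Paper` record, the function `top_cited_share`, and the toy corpus are all hypothetical.

```python
from typing import List, NamedTuple, Set, Tuple


class Paper(NamedTuple):
    paper_id: str
    year: int
    terms: Set[str]  # candidate phrases mentioned in the paper
    cites: Set[str]  # ids of the papers this paper cites


def top_cited_share(term: str, papers: List[Paper]) -> Tuple[str, float]:
    """Return (paper_id, share) for the paper that is cited by the largest
    share of later papers mentioning `term`.

    Simplified illustration only: a share near 1.0 suggests one paper
    introduced or popularized the term; a low share suggests a generic phrase.
    """
    mentioning = [p for p in papers if term in p.terms]
    best_id, best_share = "", 0.0
    for cand in mentioning:
        later = [p for p in mentioning if p.year > cand.year]
        if not later:
            continue
        share = sum(cand.paper_id in p.cites for p in later) / len(later)
        if share > best_share:
            best_id, best_share = cand.paper_id, share
    return best_id, best_share


# Toy corpus (made up for illustration).
papers = [
    Paper("A", 2015, {"resnet"}, set()),
    Paper("B", 2016, {"resnet", "neural network"}, {"A"}),
    Paper("C", 2017, {"resnet", "neural network"}, {"A"}),
    Paper("D", 2017, {"neural network"}, {"X"}),
    Paper("E", 2018, {"resnet", "neural network"}, {"A", "B"}),
]

resnet_result = top_cited_share("resnet", papers)
generic_result = top_cited_share("neural network", papers)
```

In the toy corpus, every later paper that mentions "resnet" cites paper A, while no single paper dominates the citations of papers that mention "neural network". A real system would also need to guard against terms with very few mentions, which this sketch does not attempt.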

Code & Models

Repositories

allenai/ForeCite (official)