The Scientific Contribution Graph: Automated Literature-based Technological Roadmapping at Scale
Peter A. Jansen

TL;DR
This paper introduces the Scientific Contribution Graph, a large-scale AI/NLP resource linking scientific contributions and prerequisites to enable automated technological roadmapping and discovery.
Contribution
It presents a novel large-scale graph of scientific contributions and prerequisites, and introduces the task of scientific prerequisite prediction for discovery.
Findings
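The graph contains 2 million contributions extracted from 230k open-access papers, connected by 12.5 million prerequisite edges.
Contemporary models reach 0.48 MAP (mean average precision) on scientific prerequisite prediction under temporally filtered backtesting.
The authors anticipate the resource will support scientific impact assessment and automated discovery.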
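For readers unfamiliar with the MAP figure quoted above, the following is a minimal sketch of how mean average precision is conventionally computed over ranked predictions. It illustrates the standard metric only; it is not the paper's evaluation code, and the function names and input layout are assumptions made for illustration.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision for one query: mean of precision@k at each relevant hit."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    # Normalise by the number of relevant items, so misses are penalised.
    return precision_sum / len(relevant)


def mean_average_precision(
    queries: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of per-query average precision over (ranking, gold set) pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0
```

In this sketch each query would correspond to one target contribution, the ranking to a model's ordered list of candidate prerequisites, and the gold set to the prerequisites recorded in the graph.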
Abstract
Scientific contributions rarely develop in isolation, but instead build upon prior discoveries. We formulate the task of automated technological roadmapping as extracting scientific contributions from scholarly articles and linking them to their prerequisites. We present the Scientific Contribution Graph, a large-scale AI/NLP-domain resource containing 2 million detailed scientific contributions extracted from 230k open-access papers and connected by 12.5 million prerequisite edges. We further introduce scientific prerequisite prediction, a scientific discovery task in which models predict which existing technologies can enable future discoveries, and show that contemporary models are rapidly improving on this task, reaching 0.48 MAP when evaluated using temporally filtered backtesting. We anticipate technological roadmapping resources such as this will support scientific impact…
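The structure the abstract describes, contributions as nodes, prerequisite links as edges, and a temporal filter for backtesting, can be sketched roughly as below. Everything here is hypothetical: the field names, the class, and the "strictly earlier year" filter are assumptions for illustration, since the abstract does not specify the resource's schema or the exact filtering rule.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Contribution:
    # Hypothetical fields; the real resource's schema is not given in the abstract.
    cid: str
    paper_id: str
    year: int
    description: str


class ContributionGraph:
    """Toy directed graph: an edge cid -> prereq means 'cid builds on prereq'."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Contribution] = {}
        self.prereqs: Dict[str, Set[str]] = {}

    def add(self, contribution: Contribution) -> None:
        self.nodes[contribution.cid] = contribution
        self.prereqs.setdefault(contribution.cid, set())

    def link(self, cid: str, prereq_cid: str) -> None:
        if cid not in self.nodes or prereq_cid not in self.nodes:
            raise KeyError("both contributions must be added before linking")
        self.prereqs[cid].add(prereq_cid)

    def candidates_before(self, year: int) -> Set[str]:
        # Assumed temporal filter: only contributions from strictly earlier years.
        return {cid for cid, c in self.nodes.items() if c.year < year}

    def backtest_case(self, cid: str) -> Tuple[Set[str], Set[str]]:
        """Return (candidate pool, gold prerequisites) visible before cid's year."""
        pool = self.candidates_before(self.nodes[cid].year)
        gold = self.prereqs[cid] & pool
        return pool, gold
```

A model under evaluation would rank the candidate pool for each target contribution, and the ranking would be scored against the gold set. Restricting the pool to earlier work is what keeps the backtest from leaking future information.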
