PubMed knowledge graph 2.0: Connecting papers, patents, and clinical trials in biomedical science
Jian Xu, Chao Yu, Jiawei Xu, Vetle I. Torvik, Jaewoo Kang, Mujeen Sung, Min Song, Yi Bu, Ying Ding

TL;DR
PKG 2.0 is a large, integrated biomedical knowledge graph connecting papers, patents, and clinical trials, enabling systematic analysis and discovery in biomedical research.
Contribution
The paper introduces PKG 2.0, a comprehensive, multi-source biomedical knowledge graph with extensive entity linkages and citation data, addressing data fragmentation issues.
Findings
PKG 2.0 contains over 36 million papers, 1.3 million patents, and 0.48 million clinical trials.
Reports high performance on the author name disambiguation and biomedical entity recognition tasks used to build the graph.
Demonstrates the dataset's utility for literature mining and biomedical research.
Abstract
Papers, patents, and clinical trials are essential scientific resources in biomedicine, crucial for knowledge sharing and dissemination. However, these documents are often stored in disparate databases with varying management standards and data formats, making it challenging to form systematic and fine-grained connections among them. To address this issue, we construct PKG 2.0, a comprehensive knowledge graph dataset encompassing over 36 million papers, 1.3 million patents, and 0.48 million clinical trials in the biomedical field. PKG 2.0 integrates these dispersed resources through 482 million biomedical entity linkages, 19 million citation linkages, and 7 million project linkages. The construction of PKG 2.0 wove together fine-grained biomedical entity extraction, high-performance author name disambiguation, multi-source citation integration, and high-quality project data from the NIH…
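The linkage idea in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the PKG 2.0 schema or construction code: the record identifiers, entity names, and helper functions are all hypothetical. It assumes only that documents of different types can be joined through shared biomedical entities and through citation edges.

```python
from collections import defaultdict

# Hypothetical toy records; every identifier and entity name is invented
# for illustration and does not come from PKG 2.0.
ENTITY_LINKS = [
    ("paper:P1", "entity:BRCA1"),
    ("patent:US1", "entity:BRCA1"),
    ("trial:T1", "entity:BRCA1"),
    ("paper:P2", "entity:TP53"),
]
CITATION_LINKS = [
    ("patent:US1", "paper:P1"),  # (citing document, cited document)
    ("trial:T1", "paper:P2"),
]


def build_entity_index(links):
    """Map each entity to the set of documents that mention it."""
    index = defaultdict(set)
    for doc, entity in links:
        index[entity].add(doc)
    return index


def documents_sharing_entity(index, entity):
    """Group the documents mentioning `entity` by their type prefix."""
    grouped = defaultdict(list)
    for doc in sorted(index.get(entity, ())):
        doc_type = doc.split(":", 1)[0]
        grouped[doc_type].append(doc)
    return dict(grouped)


def cited_documents(citations, source):
    """Return the documents cited by `source`, in input order."""
    return [target for src, target in citations if src == source]


index = build_entity_index(ENTITY_LINKS)
related = documents_sharing_entity(index, "entity:BRCA1")
patent_cites = cited_documents(CITATION_LINKS, "patent:US1")
```

Here `related` groups one paper, one patent, and one trial under the same entity, which is the kind of cross-source connection the abstract describes; a real graph at this scale would use a database or graph store rather than in-memory dictionaries.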
Taxonomy
Topics: Academic Publishing and Open Access · Biomedical and Engineering Education · Genetics, Bioinformatics, and Biomedical Research
