NIH-MPINet: A Large-Scale Feature-Rich Network Dataset for Mapping the Frontiers of Team Science
Cuiran Shi, Shuying Han, Shreya Kusumanchi, Mia Zhou, Didong Li

TL;DR
This paper introduces NIH-MPINet, a large-scale network dataset of multi-PI collaborations on NIH R01-equivalent grants from 2006 to 2023, built to support analysis of biomedical research communities and topic trends.
Contribution
The creation and detailed characterization of NIH-MPINet, a feature-rich, large-scale collaboration network dataset for biomedical research, with community detection and temporal analysis.
Findings
Identified 19 collaboration communities and 20 major research topics; several communities have distinct thematic profiles.
Mapped shifts in research topics and collaboration patterns over time.
Highlighted prominent research areas such as cardiovascular health, cancer immunotherapy, neuroscience, and microbiome research.
Abstract
This study presents a large-scale network dataset, NIH-MPINet, curated from NIH RePORTER and PubMed, characterizing collaboration among multiple Principal Investigators (multi-PIs) on NIH R01-equivalent grants from 2006 to 2023. The network represents 30,127 PIs as nodes and their collaborations on 86,743 NIH R01-equivalent grants as edges, spanning 888 recipient organizations and supported by 40 NIH Institutes and Centers. We also curated comprehensive metadata, including node-level features such as PI affiliation and edge-level features comprising grant years, titles, and abstracts. Using these data, we constructed a PI collaboration network and identified 19 communities as well as 20 major research topics. Several collaboration communities showed distinct thematic profiles, such as cardiovascular health, cancer immunotherapy, neuroscience, and microbiome research, while…
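To make the node and edge structure described in the abstract concrete, here is a minimal sketch of how grant records might be turned into a PI collaboration network. The record fields (`grant_id`, `year`, `title`, `pis`), the toy data, and the function names are all hypothetical and do not reflect the dataset's actual schema. The grouping step uses plain connected components as a simple stand-in; it is not the community detection method used by the authors, which this summary does not specify.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; field names are illustrative, not the dataset's schema.
grants = [
    {"grant_id": "G1", "year": 2010, "title": "Toy grant A", "pis": ["pi_a", "pi_b"]},
    {"grant_id": "G2", "year": 2015, "title": "Toy grant B", "pis": ["pi_b", "pi_c", "pi_d"]},
    {"grant_id": "G3", "year": 2020, "title": "Toy grant C", "pis": ["pi_e", "pi_f"]},
]


def build_network(grant_records):
    """Map each unordered PI pair to the list of grants they share (edge features)."""
    edges = defaultdict(list)
    for grant in grant_records:
        # Every pair of distinct PIs on the same grant becomes an edge.
        for u, v in combinations(sorted(set(grant["pis"])), 2):
            edges[frozenset((u, v))].append(
                {"grant_id": grant["grant_id"], "year": grant["year"], "title": grant["title"]}
            )
    return dict(edges)


def connected_components(edges):
    """Group PIs into connected components (a crude stand-in for community detection)."""
    adjacency = defaultdict(set)
    for pair in edges:
        u, v = tuple(pair)
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


network = build_network(grants)
groups = connected_components(network)
```

In this toy example, a grant with three PIs contributes three pairwise edges, and each edge carries the year and title of the grant that produced it, mirroring the edge-level features the abstract lists. Node-level features such as PI affiliation could be kept in a separate dictionary keyed by PI identifier.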
