Decoding Patterns of Data Generation Teams for Clinical and Scientific Success: Insights from the Bridge2AI Talent Knowledge Graph
Jiawei Xu, Qingnan Xie, Meijun Liu, Zhandos Sembay, Swathi Thaker, Pamela Payne-Foster, Jake Chen, Ying Ding

TL;DR
This study analyzes how team attributes influence the success of biomedical datasets, revealing that leadership, team composition, and diversity significantly affect both scientific impact and clinical translation.
Contribution
It introduces a novel analysis linking team characteristics to dataset success using the Bridge2AI Talent Knowledge Graph and explainable AI methods.
Findings
PI leadership and academic strength predict dataset success
Team size and career age have mixed effects on impact and translation
Greater female representation correlates with higher dataset success
Abstract
High-quality biomedical datasets are essential for medical research and disease treatment innovation. The NIH-funded Bridge2AI project strives to facilitate such innovations by uniting top-tier, diverse teams to curate datasets designed for AI-driven biomedical research. We examined 1,699 dataset papers from the Nucleic Acids Research (NAR) database issues and the Bridge2AI Talent Knowledge Graph. By treating each paper's authors as a team, we explored the relationship between team attributes (team power and fairness) and dataset paper quality, measured by scientific impact (Relative Citation Ratio percentile) and clinical translation power (APT, likelihood of citation by clinical trials and guidelines). Utilizing the SHAP explainable AI framework, we identified correlations between team attributes and the success of dataset papers in both citation impact and clinical translation. Key…
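The abstract credits the SHAP framework with attributing dataset-paper success to individual team attributes. SHAP is built on Shapley values, so the underlying idea can be sketched with an exact Shapley computation on a toy example. Everything below is an illustrative assumption and not the authors' pipeline: the feature names (pi_h_index, team_size, female_ratio), the scoring function toy_model, and the all-zero baseline are hypothetical, and the exact enumeration used here stands in for the approximations the SHAP library applies to real models.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance`, relative to `baseline`."""
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in `subset` take the instance's value; the rest stay at baseline.
        mixed = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


def toy_model(team):
    # Hypothetical scoring function with one interaction term (NOT the paper's model).
    return (2.0 * team["pi_h_index"]
            + 1.0 * team["team_size"]
            + 0.5 * team["pi_h_index"] * team["team_size"])


# Hypothetical team and reference point, chosen only for illustration.
baseline = {"pi_h_index": 0.0, "team_size": 0.0, "female_ratio": 0.0}
team = {"pi_h_index": 4.0, "team_size": 2.0, "female_ratio": 0.5}

phi = shapley_values(toy_model, team, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.2f}")
```

The attributions sum to the difference between the model's output for the team and for the baseline, which is the property that lets per-feature contributions be read as shares of a prediction. A feature the toy model never uses (female_ratio here) receives zero attribution; that reflects this invented model only and says nothing about the paper's findings.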
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Biomedical and Engineering Education · Biomedical Text Mining and Ontologies
