Reproducible Domain-Specific Knowledge Graphs in the Life Sciences: a Systematic Literature Review
Samira Babalou, Sheeba Samuel, Birgitta König-Ries

TL;DR
This systematic review finds that domain-specific knowledge graphs (KGs) in the life sciences face significant reproducibility challenges: only 0.4% of the 250 KGs surveyed are reproducible, which underscores the need for improved practices.
Contribution
The paper provides a comprehensive analysis of reproducibility in domain-specific KGs, comparing 250 KGs across 19 domains and identifying critical gaps and challenges.
Findings
Only 3.2% of KGs provide publicly available source code.
Of the eight KGs with publicly available source code, only one passed the reproducibility assessment.
Reproducible KGs constitute only 0.4% of published domain-specific KGs.
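The percentages above can be cross-checked against the 250 KGs compared in the review. The sketch below assumes that both percentages are taken over all 250 KGs (the summary does not state the denominators explicitly); the helper name `count_from_percent` is hypothetical and used only for illustration.

```python
TOTAL_KGS = 250  # number of KGs compared in the review


def count_from_percent(percent: float, total: int = TOTAL_KGS) -> int:
    """Convert a percentage of `total` into a whole-number count."""
    return round(percent / 100 * total)


# Assumption: both percentages use the full set of 250 KGs as denominator.
with_code = count_from_percent(3.2)     # KGs with public source code
reproducible = count_from_percent(0.4)  # KGs that passed the assessment

print(f"KGs with public source code: {with_code}")
print(f"Reproducible KGs: {reproducible}")
```

Under that assumption, the counts agree with the eight systems and the single passing system mentioned in the findings.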
Abstract
Knowledge graphs (KGs) are widely used for representing and organizing structured knowledge in diverse domains. However, the creation and upkeep of KGs pose substantial challenges. Developing a KG demands extensive expertise in data modeling, ontology design, and data curation. Furthermore, KGs are dynamic, requiring continuous updates and quality control to ensure accuracy and relevance. These intricacies contribute to the considerable effort required for their development and maintenance. One critical dimension of KGs that warrants attention is reproducibility. The ability to replicate and validate KGs is fundamental for ensuring the trustworthiness and sustainability of the knowledge they represent. Reproducible KGs not only support open science by allowing others to build upon existing knowledge but also enhance transparency and reliability in disseminating information. Despite the…
Taxonomy
Topics: Scientific Computing and Data Management · Biomedical Text Mining and Ontologies · Bioinformatics and Genomic Networks
Methods: Ontology
