TL;DR
This paper presents FAIR Jupyter, a knowledge graph that enhances the sharing, exploration, and analysis of a dataset on the reproducibility of Jupyter notebooks, improving data reusability and discoverability.
Contribution
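It converts a complex dataset that was previously shared in bulk into a semantic knowledge graph, enabling granular exploration and tailored queries that make the data more FAIR (findable, accessible, interoperable, reusable) and its reproducibility results easier to interrogate.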
Findings
Enables detailed exploration of reproducibility data
Supports customized queries for research and education
Improves the dataset's FAIRness and the communication of data quality
Abstract
The way in which data are shared can affect their utility and reusability. Here, we demonstrate how data that we had previously shared in bulk can be mobilized further through a knowledge graph that allows for much more granular exploration and interrogation. The original dataset is about the computational reproducibility of GitHub-hosted Jupyter notebooks associated with biomedical publications. It contains rich metadata about the publications, associated GitHub repositories and Jupyter notebooks, and the notebooks' reproducibility. We took this dataset, converted it into semantic triples and loaded these into a triple store to create a knowledge graph, FAIR Jupyter, that we made accessible via a web service. This enables granular data exploration and analysis through queries that can be tailored to specific use cases. Such queries may provide details about any of the variables from…
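The pipeline described in the abstract (records converted to triples, triples loaded into a store, the store queried by pattern) can be illustrated with a minimal sketch. This is not the FAIR Jupyter implementation or schema: the records, identifiers such as `notebook/1`, the predicate names, and the helpers `to_triples` and `match` are all hypothetical, and a plain Python set stands in for a real triple store and query language.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Hypothetical records; identifiers and field names are illustrative only
# and are not taken from the actual dataset.
records = [
    {"id": "notebook/1", "repository": "repo/A", "reproducible": "true"},
    {"id": "notebook/2", "repository": "repo/A", "reproducible": "false"},
    {"id": "notebook/3", "repository": "repo/B", "reproducible": "true"},
]


def to_triples(record: dict[str, str]) -> list[Triple]:
    """Turn one flat record into (subject, predicate, object) triples."""
    subject = record["id"]
    return [(subject, key, value) for key, value in record.items() if key != "id"]


def match(
    store: set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in store:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


# "Load" the triples into an in-memory stand-in for a triple store.
store: set[Triple] = set()
for record in records:
    store.update(to_triples(record))

# A tailored query: which notebooks in repo/A were reproducible?
reproducible = {s for s, _, _ in match(store, p="reproducible", o="true")}
in_repo_a = {s for s, _, _ in match(store, p="repository", o="repo/A")}
result = sorted(reproducible & in_repo_a)
print(result)
```

The point of the sketch is the shift in granularity: once each fact is a separate triple, a question can combine any subset of variables without downloading and parsing the bulk dataset first.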
