FAIR Jupyter: a knowledge graph approach to semantic sharing and granular exploration of a computational notebook reproducibility dataset

Sheeba Samuel; Daniel Mietchen

arXiv:2404.12935·cs.CE·January 7, 2025

TL;DR

This paper presents FAIR Jupyter, a knowledge graph that enhances the sharing, exploration, and analysis of a dataset on the reproducibility of Jupyter notebooks, improving data reusability and discoverability.

Contribution

It introduces a semantic knowledge graph for a complex dataset, enabling granular exploration and tailored queries to improve data FAIRness and reproducibility insights.

Findings

01. Enables detailed exploration of reproducibility data
02. Supports customized queries for research and education
03. Enhances dataset FAIRness and data quality communication

Abstract

The way in which data are shared can affect their utility and reusability. Here, we demonstrate how data that we had previously shared in bulk can be mobilized further through a knowledge graph that allows for much more granular exploration and interrogation. The original dataset is about the computational reproducibility of GitHub-hosted Jupyter notebooks associated with biomedical publications. It contains rich metadata about the publications, associated GitHub repositories and Jupyter notebooks, and the notebooks' reproducibility. We took this dataset, converted it into semantic triples and loaded these into a triple store to create a knowledge graph, FAIR Jupyter, that we made accessible via a web service. This enables granular data exploration and analysis through queries that can be tailored to specific use cases. Such queries may provide details about any of the variables from…
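The conversion step the abstract describes (tabular records turned into semantic triples that can then be queried by pattern) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record fields, the `ex:` prefix, and the helper names `to_triples` and `match` are hypothetical and are not taken from the FAIR Jupyter schema or its web service. A real deployment would load the triples into a triple store and query them with SPARQL rather than with an in-memory list.

```python
# Hypothetical records: field names and values are illustrative only,
# not taken from the FAIR Jupyter dataset.
notebooks = [
    {"id": "nb1", "repository": "repo1", "language": "python", "reproducible": True},
    {"id": "nb2", "repository": "repo1", "language": "python", "reproducible": False},
    {"id": "nb3", "repository": "repo2", "language": "R", "reproducible": True},
]


def to_triples(record):
    """Turn one flat record into (subject, predicate, object) triples."""
    subject = f"ex:{record['id']}"
    return [
        (subject, f"ex:{key}", value)
        for key, value in record.items()
        if key != "id"
    ]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Build the in-memory "graph" as a flat list of triples.
graph = [t for record in notebooks for t in to_triples(record)]

# Granular query: which notebooks in repo1 were reproducible?
reproducible = {s for s, _, _ in match(graph, p="ex:reproducible", o=True)}
in_repo1 = {s for s, _, _ in match(graph, p="ex:repository", o="repo1")}
print(sorted(reproducible & in_repo1))
```

The point of the sketch is the shape of the data: once each variable of a record is its own triple, a query can target any single variable or combine several, which is what makes exploration more granular than a bulk download.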

Code & Models

Repositories

fusion-jena/fairjupyter (official)
