yProv4DV: Reproducible Data Visualization Scripts Out of the Box
Gabriele Padovani, Sandro Fiore

TL;DR
yProv4DV is a lightweight Python library that enhances reproducibility of data visualization scripts by automatically tracking inputs, outputs, and code, facilitating independent reproduction of scientific plots.
Contribution
It introduces a novel, minimal-intrusion tool that enables reproducible visualization scripts without extensive code modifications, filling a gap in current reproducibility solutions.
Findings
Enables full reproducibility of visualization scripts with a single call.
Tracks inputs, outputs, and source code automatically.
Fills a gap in reproducibility workflows for scientific plots.
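As a rough illustration of what tracking inputs, outputs, and source code can mean in practice, the sketch below records a content hash for each file a plotting script touches and writes the result to a JSON manifest. It uses only the Python standard library and is not yProv4DV's API: the names `sha256_of`, `write_manifest`, and `demo`, the manifest layout, and the file names are all hypothetical and chosen for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Iterable


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_manifest(script: Path, inputs: Iterable[Path],
                   outputs: Iterable[Path], manifest: Path) -> dict:
    """Record the name and content hash of the script, its inputs, and its outputs.

    Hypothetical helper: the manifest layout is an assumption, not a standard.
    """
    def describe(paths):
        return [{"name": p.name, "sha256": sha256_of(p)} for p in paths]

    record = {
        "script": describe([script])[0],
        "inputs": describe(inputs),
        "outputs": describe(outputs),
    }
    manifest.write_text(json.dumps(record, indent=2))
    return record


def demo() -> dict:
    """Build a toy script/input/output trio and return the manifest read from disk."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        script = root / "plot.py"
        data = root / "data.csv"
        figure = root / "figure.png"

        # Stand-ins for a real plotting script, its input data, and its figure.
        script.write_bytes(b"print('plotting')\n")
        data.write_bytes(b"x,y\n1,2\n")
        figure.write_bytes(b"placeholder image bytes")

        manifest = root / "manifest.json"
        write_manifest(script, [data], [figure], manifest)
        return json.loads(manifest.read_text())


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

Sharing such a manifest alongside a figure lets a reader check that they are rerunning the same script on the same data, and that the regenerated output matches byte for byte. It does not capture the execution context (library versions, environment), which the abstract also lists as necessary for reproduction.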
Abstract
While result visualization is a critical phase in the communication of new academic results, plots are frequently shared without the complete combination of code, input data, execution context, and outputs required to independently reproduce the resulting figures. Existing reproducibility solutions tend to focus on computational pipelines or workflow management systems and do not cover the script-based visualization practices commonly used by researchers and practitioners. Additionally, the minimalist nature of current Python data visualization libraries tends to speed up the creation of images, which discourages users from spending time integrating additional tools into these short scripts. This paper proposes yProv4DV, a lightweight library designed to enable reproducible data visualization scripts through the use of provenance information, minimizing the need for code modifications.…
Taxonomy
Topics: Scientific Computing and Data Management · Data Visualization and Analytics · Research Data Management Practices
