ZnTrack -- Data as Code
Fabian Zills, Moritz Schäfer, Samuel Tovey, Johannes Kästner, and Christian Holm

TL;DR
ZnTrack is an open-source Python tool that simplifies data versioning, experiment tracking, and sharing by integrating data management into the coding workflow, embodying the 'Data as Code' concept for scalable, FAIR-compliant data handling.
Contribution
The paper introduces ZnTrack, a user-friendly Python package that extends version control to data and experiments, promoting the 'Data as Code' paradigm for better data management.
Findings
Enables easy tracking of experiment parameters and data workflows
Supports sharing and storing large datasets efficiently
Promotes FAIR data principles in data versioning
Abstract
The past decade has seen tremendous breakthroughs in computation and there is no indication that this will slow any time soon. Machine learning, large-scale computing resources, and increased industry focus have resulted in rising investments in computer-driven solutions for data management, simulations, and model generation. However, with this growth in computation has come an even larger expansion of data and with it, complexity in data storage, sharing, and tracking. In this work, we introduce ZnTrack, a Python-driven data versioning tool. ZnTrack builds upon established version control systems to provide a user-friendly and easy-to-use interface for tracking parameters in experiments, designing workflows, and storing and sharing data. From this ability to reduce large datasets to a simple Python script emerges the concept of Data as Code, a core component of the work presented here…
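The idea of reducing a dataset to the script that produces it can be illustrated with the standard library alone. The sketch below does not use ZnTrack's API: the class `ScaleNode`, its fields, and its `record` method are hypothetical names chosen for illustration. It only shows the underlying principle, assuming a deterministic workflow step whose output is fully defined by its parameters, so that a small record of parameters plus an output hash can be versioned in place of the data itself.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class ScaleNode:
    """Hypothetical workflow step (not part of ZnTrack): parameters in, data out."""

    factor: float
    n_points: int

    def run(self) -> list:
        # Deterministic "experiment": the data is fully defined by the parameters.
        return [self.factor * i for i in range(self.n_points)]

    def record(self) -> dict:
        # Build a small record that could be committed to version control:
        # the parameters plus a SHA-256 hash identifying the produced data.
        payload = json.dumps(self.run()).encode("utf-8")
        return {
            "node": type(self).__name__,
            "params": asdict(self),
            "output_sha256": hashlib.sha256(payload).hexdigest(),
        }


# Same code and parameters give the same record; changed parameters change the hash.
record_a = ScaleNode(factor=0.5, n_points=4).record()
record_b = ScaleNode(factor=0.5, n_points=4).record()
record_c = ScaleNode(factor=2.0, n_points=4).record()

print(json.dumps(record_a, indent=2))
print("reproducible:", record_a == record_b)
print("changed by new parameters:", record_a["output_sha256"] != record_c["output_sha256"])
```

In this toy setting the record is a few hundred bytes regardless of how large the output grows, which is the sense in which the script, rather than the dataset, becomes the versioned artifact.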
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Computational Physics and Python Applications
Methods: Focus
