dagger: A Python Framework for Reproducible Machine Learning Experiment Orchestration
Michela Paganini, Jessica Zosa Forde

TL;DR
dagger is a Python framework for reproducible, reusable orchestration of complex, multi-stage machine learning experiments. It targets the difficulty of tracking experimental provenance, that is, the tree of states that leads to a final model configuration and result.
Contribution
The paper introduces dagger, a framework that simplifies managing complex, branching experiment workflows and facilitates reproducibility in machine learning research.
Findings
Enables reproducible experiment pipelines
Simplifies tracking of experimental provenance
Supports multi-stage machine learning workflows
Abstract
Many research directions in machine learning, particularly in deep learning, involve complex, multi-stage experiments, often with state-mutating operations acting on models along multiple paths of execution. Although machine learning frameworks provide clean interfaces for defining model architectures and unbranched flows, the burden is often placed on the researcher to track experimental provenance, that is, the state tree that leads to a final model configuration and result in a multi-stage experiment. Originally motivated by analysis reproducibility in the context of neural network pruning research, where multi-stage experiment pipelines are common, we present dagger, a framework to facilitate reproducible and reusable experiment orchestration. We describe the design principles of the framework and example usage.
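To make the idea of a "state tree" concrete, the following is a minimal sketch of provenance tracking for a branching experiment. It is not dagger's API: the `StateNode` class, its `apply` and `lineage` methods, and the toy `prune` and `finetune` steps are all hypothetical names introduced here for illustration, and the "model" is assumed to be a plain dictionary of weights rather than a real network.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass(eq=False)
class StateNode:
    """Hypothetical node in a provenance tree: a state plus the step that made it."""
    state: Dict[str, Any]
    step: str = "root"
    params: Dict[str, Any] = field(default_factory=dict)
    parent: Optional["StateNode"] = field(default=None, repr=False)
    children: List["StateNode"] = field(default_factory=list, repr=False)

    def apply(self, fn: Callable[..., Dict[str, Any]], **params: Any) -> "StateNode":
        # Work on a deep copy so the parent state is never mutated,
        # which lets several branches start from the same node.
        new_state = fn(copy.deepcopy(self.state), **params)
        child = StateNode(new_state, fn.__name__, dict(params), parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> List[Tuple[str, Dict[str, Any]]]:
        # Walk back to the root to recover the steps that produced this state.
        node: Optional["StateNode"] = self
        path = []
        while node is not None:
            path.append((node.step, node.params))
            node = node.parent
        return path[::-1]


def prune(state: Dict[str, Any], fraction: float) -> Dict[str, Any]:
    """Toy magnitude pruning: zero out the smallest `fraction` of weights."""
    weights = state["weights"]
    k = int(len(weights) * fraction)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    for i in smallest:
        weights[i] = 0.0
    return state


def finetune(state: Dict[str, Any], scale: float) -> Dict[str, Any]:
    """Toy stand-in for fine-tuning: rescale every weight."""
    state["weights"] = [w * scale for w in state["weights"]]
    return state


root = StateNode({"weights": [0.1, -0.5, 0.9, -0.05]})

# Two branches from the same starting state, each with its own recorded history.
branch_a = root.apply(prune, fraction=0.5).apply(finetune, scale=2.0)
branch_b = root.apply(prune, fraction=0.25)

print(branch_a.lineage())
print(branch_b.state["weights"])
```

In this sketch, each result carries the full sequence of operations and parameters that produced it, and the root state remains available for further branches. A real orchestration framework would additionally need to handle persistence, caching, and actual model objects, none of which are attempted here.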
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science · Machine Learning and Data Classification
Methods: Pruning
