Reproducibility Beyond Artifacts: Interactional Support for Collaborative Machine Learning
Zhiwei Li, Carl Kesselman

TL;DR
This paper argues that reproducibility in collaborative ML requires supporting ongoing interaction and shared understanding, not just capturing artifacts. It proposes a socio-technical system with an AI-mediated semantic interface.
Contribution
The paper introduces a two-layer ML management system that combines artifact infrastructure with interactional support to improve reproducibility in collaborative settings.
Findings
Identifies interactional breakdowns in ML projects despite artifact traceability.
Proposes a socio-technical system to mediate coordination and explanation.
Reframes reproducibility as an ongoing socio-technical process.
Abstract
Machine learning (ML) reproducibility is often framed as a problem of incomplete artifact recording. This framing leads to systems that prioritize capturing datasets, code, configurations, and execution environments. However, in collaborative and interdisciplinary ML projects, reproducibility failures often arise not only from missing artifacts but from difficulties in interpreting prior work, aligning evolving components, and reconstructing experimental intent over time. Drawing on a 19-month deployment of a data-centric ML management system in a clinical research project, we identify recurring interactional breakdowns that persist despite comprehensive structural traceability. Based on these findings, we propose a two-layer socio-technical ML management system combining lifecycle-aware artifact infrastructure with an interactional layer designed to mediate coordination, explanation,…
