Reproducibility in Learning
Russell Impagliazzo, Rex Lei, Toniann Pitassi, Jessica Sorrell

TL;DR
This paper formalizes reproducible learning algorithms, shows that efficient ones exist for several fundamental problems in statistics and learning, and studies the tradeoffs that reproducibility imposes.
Contribution
It introduces a formal definition of reproducible learning algorithms, develops methods to make existing algorithms reproducible, and analyzes the associated tradeoffs and bounds.
Findings
Reproducible algorithms can be constructed for key learning tasks.
Reproducibility incurs a modest increase in sample complexity.
Upper and lower bounds on the sample complexity cost of reproducibility are nearly tight for statistical query algorithms.
Abstract
We introduce the notion of a reproducible algorithm in the context of learning. A reproducible learning algorithm is resilient to variations in its samples -- with high probability, it returns the exact same output when run on two samples from the same underlying distribution. We begin by unpacking the definition, clarifying how randomness is instrumental in balancing accuracy and reproducibility. We initiate a theory of reproducible algorithms, showing how reproducibility implies desirable properties such as data reuse and efficient testability. Despite the exceedingly strong demand of reproducibility, there are efficient reproducible algorithms for several fundamental problems in statistics and learning. First, we show that any statistical query algorithm can be made reproducible with a modest increase in sample complexity, and we use this to construct reproducible algorithms for…
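To make the definition concrete, here is a minimal sketch of how shared internal randomness can trade a little accuracy for exact agreement between runs. It estimates a mean and snaps it to a grid whose offset is drawn from the shared random seed, in the spirit of randomized rounding. The function name reproducible_mean, the width parameter, and the restriction to values in [0, 1] are assumptions made for this illustration; this is not the algorithm as stated in the paper.

```python
import math
import random


def reproducible_mean(sample, seed, width=0.1):
    """Estimate the mean of values in [0, 1], snapped to a randomly shifted grid.

    Hypothetical illustration: `seed` plays the role of the algorithm's
    internal randomness, which both runs are assumed to share.
    """
    rng = random.Random(seed)
    offset = rng.uniform(0.0, width)  # random shift of the rounding grid
    empirical = sum(sample) / len(sample)
    # Find the grid cell [offset + k*width, offset + (k+1)*width) holding the
    # empirical mean, then return the centre of that cell.
    k = math.floor((empirical - offset) / width)
    return offset + (k + 0.5) * width


if __name__ == "__main__":
    data_rng = random.Random(0)
    # Two independent samples from the same distribution (Bernoulli(0.3)).
    s1 = [1.0 if data_rng.random() < 0.3 else 0.0 for _ in range(20000)]
    s2 = [1.0 if data_rng.random() < 0.3 else 0.0 for _ in range(20000)]
    # Same seed: the outputs match exactly unless a grid boundary happens to
    # fall between the two empirical means.
    print(reproducible_mean(s1, seed=42), reproducible_mean(s2, seed=42))
```

The two empirical means differ slightly, but they round to the same grid cell unless a cell boundary lands between them, which happens with probability roughly (difference in means) / width over the choice of offset. A wider grid improves agreement at the cost of accuracy (the output can be up to width / 2 from the empirical mean), and shrinking the gap between the two means requires more samples, which is one way to see why reproducibility costs extra sample complexity.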
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Statistical Methods and Inference
