Between Randomness and Arbitrariness: Some Lessons for Reliable Machine Learning at Scale
A. Feder Cooper

TL;DR
This dissertation examines reliable measurement in machine learning, covering arbitrariness, randomness, and the evaluation of generative models, and argues that trustworthy AI systems require interdisciplinary approaches spanning machine learning, law, and policy.
Contribution
It articulates a research vision for reliable measurement in ML that integrates law, policy, and technical methods, organized around three themes.
Findings
Quantifying and reducing arbitrariness in ML models
Improving scalability of uncertainty estimation without losing reliability
Developing evaluation methods for generative AI systems
Abstract
To develop rigorous knowledge about ML models -- and the systems in which they are embedded -- we need reliable measurements. But reliable measurement is fundamentally challenging, and touches on issues of reproducibility, scalability, uncertainty quantification, epistemology, and more. This dissertation addresses criteria needed to take reliability seriously: both criteria for designing meaningful metrics, and for methodologies that ensure that we can dependably and efficiently measure these metrics at scale and in practice. In doing so, this dissertation articulates a research vision for a new field of scholarship at the intersection of machine learning, law, and policy. Within this frame, we cover topics that fit under three different themes: (1) quantifying and mitigating sources of arbitrariness in ML, (2) taming randomness in uncertainty estimation and optimization algorithms, in…
Taxonomy
Topics: Machine Learning and Data Classification
Methods: Diffusion
