What Do Learned Models Measure?
Indrė Žliobaitė

TL;DR
This paper argues that measurement stability matters when learned models serve as measurement instruments. It shows that standard evaluation metrics do not ensure a consistent measurement function across contexts, a consistency that scientific applications depend on.
Contribution
The paper formalizes measurement stability as a new evaluation criterion for learned measurement functions and demonstrates its importance through theoretical analysis and a real-world case study.
Findings
Standard evaluation metrics do not guarantee measurement stability.
Models with similar predictive accuracy can produce different measurement functions.
Distribution shifts can cause significant measurement inconsistencies.
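The findings above can be illustrated with a toy sketch. It is not the paper's formal definition or its case study; the setup, the function names (`fit_least_squares`, `stability_demo`), and all numeric choices are assumptions made for illustration. Two one-feature linear models are fitted on different but highly correlated signals. They reach nearly the same error in the training context, yet their outputs diverge once the two signals decouple in a shifted context.

```python
import numpy as np


def fit_least_squares(X, y):
    """Ordinary least squares without an intercept; returns the coefficient vector."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def stability_demo(seed=0, n=2000):
    """Toy illustration (hypothetical setup, not the paper's experiment)."""
    rng = np.random.default_rng(seed)

    # Training context: a latent quantity z is observed through two
    # almost perfectly correlated signals x1 and x2.
    z = rng.normal(size=n)
    x1 = z + 0.05 * rng.normal(size=n)
    x2 = z + 0.05 * rng.normal(size=n)
    y = z + 0.1 * rng.normal(size=n)

    # Two realizations of the learning process: each model relies on a
    # different signal, and both fit the labels about equally well.
    w_a = fit_least_squares(x1[:, None], y)[0]
    w_b = fit_least_squares(x2[:, None], y)[0]
    mse_a = float(np.mean((w_a * x1 - y) ** 2))
    mse_b = float(np.mean((w_b * x2 - y) ** 2))

    # Disagreement between the two measured values in the training context.
    gap_train = float(np.mean(np.abs(w_a * x1 - w_b * x2)))

    # Shifted context: x1 still tracks the latent quantity, x2 no longer does.
    z_shift = rng.normal(size=n)
    x1_shift = z_shift + 0.05 * rng.normal(size=n)
    x2_shift = rng.normal(size=n)
    gap_shift = float(np.mean(np.abs(w_a * x1_shift - w_b * x2_shift)))

    return {
        "mse_a": mse_a,
        "mse_b": mse_b,
        "gap_train": gap_train,
        "gap_shift": gap_shift,
    }


if __name__ == "__main__":
    for name, value in stability_demo().items():
        print(f"{name}: {value:.4f}")
```

In this sketch the mean absolute gap between the two models' outputs stands in for a lack of measurement stability. It is a simple proxy chosen for readability, and other disagreement measures would serve equally well.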
Abstract
In many scientific and data-driven applications, machine learning models are increasingly used as measurement instruments, rather than merely as predictors of predefined labels. When the measurement function is learned from data, the mapping from observations to quantities is determined implicitly by the training distribution and inductive biases, allowing multiple inequivalent mappings to satisfy standard predictive evaluation criteria. We formalize learned measurement functions as a distinct focus of evaluation and introduce measurement stability, a property capturing invariance of the measured quantity across admissible realizations of the learning process and across contexts. We show that standard evaluation criteria in machine learning, including generalization error, calibration, and robustness, do not guarantee measurement stability. Through a real-world case study, we show that…
Taxonomy
Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference
