Calibrated bootstrap for uncertainty quantification in regression models
Glenn Palmer, Siqi Du, Alexander Politowicz, Joshua Paul Emory, Xiyu Yang, Anupraas Gautam, Grishma Gupta, Zhelong Li, Ryan Jacobs, Dane Morgan

TL;DR
This paper introduces a calibration method to improve the accuracy of bootstrap-based uncertainty estimates in regression models, demonstrating its effectiveness on synthetic and physical datasets.
Contribution
The paper proposes a novel calibration approach that significantly enhances the accuracy of bootstrap ensemble uncertainty estimates in regression tasks.
Findings
Calibration improves bootstrap uncertainty estimates on synthetic data
Calibration is also effective on physical datasets from Materials Science and Engineering
The approach is general and should apply to a wide range of regression models
Abstract
Obtaining accurate estimates of machine learning model uncertainties on newly predicted data is essential for understanding the accuracy of the model and whether its predictions can be trusted. A common approach to such uncertainty quantification is to estimate the variance from an ensemble of models, which are often generated by the generally applicable bootstrap method. In this work, we demonstrate that the direct bootstrap ensemble standard deviation is not an accurate estimate of uncertainty and propose a calibration method to dramatically improve its accuracy. We demonstrate the effectiveness of this calibration method for both synthetic data and physical datasets from the field of Materials Science and Engineering. The approach is motivated by applications in physical and biological science but is quite general and should be applicable for uncertainty quantification in a wide…
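The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the form of the calibration, so the sketch assumes the simplest option, a single scale factor `a` fitted on held-out residuals so that the calibrated uncertainty is `a * sigma_raw`. The polynomial base model, the synthetic data, and the names `bootstrap_predict` and `fit_scale` are likewise hypothetical choices made for illustration.

```python
import numpy as np

def bootstrap_predict(x_train, y_train, x_new, n_models=200, degree=3, seed=0):
    """Fit polynomial models on bootstrap resamples; return ensemble mean and std."""
    rng = np.random.default_rng(seed)
    n = len(x_train)
    preds = np.empty((n_models, len(x_new)))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)  # resample training rows with replacement
        coeffs = np.polyfit(x_train[idx], y_train[idx], degree)
        preds[m] = np.polyval(coeffs, x_new)
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)

def fit_scale(residuals, sigma_raw):
    """Assumed calibration: one scale factor a for sigma_cal = a * sigma_raw.

    For Gaussian errors, the a that minimizes the negative log-likelihood
    has the closed form a^2 = mean(residual^2 / sigma_raw^2).
    """
    return float(np.sqrt(np.mean((residuals / sigma_raw) ** 2)))

# Synthetic data (illustrative only): noisy sine curve.
rng = np.random.default_rng(42)
x = rng.uniform(-2.0, 2.0, size=300)
y = np.sin(2.0 * x) + rng.normal(0.0, 0.3, size=300)
x_tr, y_tr = x[:150], y[:150]          # training set
x_val, y_val = x[150:225], y[150:225]  # held-out set used only for calibration
x_te, y_te = x[225:], y[225:]          # test set

# Raw bootstrap uncertainties on the held-out set, then fit the scale factor.
mu_val, s_val = bootstrap_predict(x_tr, y_tr, x_val)
a = fit_scale(y_val - mu_val, s_val)

# Apply the same factor to new predictions.
mu_te, s_te = bootstrap_predict(x_tr, y_tr, x_te)
s_cal = a * s_te

# If uncertainties are accurate, residual / sigma should have RMS close to 1.
rms_raw = np.sqrt(np.mean(((y_te - mu_te) / s_te) ** 2))
rms_cal = np.sqrt(np.mean(((y_te - mu_te) / s_cal) ** 2))
print(f"scale factor a = {a:.2f}")
print(f"RMS of normalized residuals: raw = {rms_raw:.2f}, calibrated = {rms_cal:.2f}")
```

The check at the end compares residuals normalized by the raw and by the calibrated standard deviation. A root-mean-square value near 1 indicates that the stated uncertainties match the observed errors, which is the sense in which the uncalibrated ensemble standard deviation can be a poor estimate. A richer calibration, such as a scale plus an offset, would follow the same pattern but generally needs a numerical optimizer.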
Taxonomy
Topics: Machine Learning in Materials Science · Fault Detection and Control Systems · Machine Learning and Data Classification
