Bootstrap in High Dimension with Low Computation
Henry Lam, Zhenyuan Liu

TL;DR
This paper demonstrates that a minimal number of bootstrap resamples, even just one, can provide valid uncertainty quantification in high-dimensional settings, significantly reducing computational costs.
Contribution
It shows that the recent 'cheap' bootstrap approach achieves valid coverage with very few resamples, even a single one, in high-dimensional problems, enabling scalable uncertainty quantification.
Findings
Valid coverage with as few as one resample in high dimensions
Theoretical support for low-resample bootstrap validity
Experiments that validate the theory and compare the approach against other benchmarks
Abstract
The bootstrap is a popular data-driven method to quantify statistical uncertainty, but for modern high-dimensional problems, it could suffer from huge computational costs due to the need to repeatedly generate resamples and refit models. We study the use of bootstraps in high-dimensional environments with a small number of resamples. In particular, we show that with a recent "cheap" bootstrap perspective, using a number of resamples as small as one could attain valid coverage even when the dimension grows closely with the sample size, thus strongly supporting the implementability of the bootstrap for large-scale problems. We validate our theoretical results and compare the performance of our approach with other benchmarks via a range of experiments.
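To make the single-resample idea concrete, here is a minimal sketch of a cheap bootstrap interval with B = 1. The abstract does not state the interval formula, so the form used here is an assumption based on our reading of the cheap bootstrap construction: the interval is centered at the original estimate, and its half-width is a t quantile (with B degrees of freedom) times the root-mean-square deviation of the resample estimates from the original estimate. The function name `cheap_bootstrap_ci_one_resample` and the toy data are hypothetical, and the sample mean of a scalar stands in for the high-dimensional estimators the paper studies.

```python
import math
import random
import statistics


def cheap_bootstrap_ci_one_resample(data, estimator, alpha=0.05, rng=None):
    """Sketch of a confidence interval built from one bootstrap resample.

    Assumed form (not taken from the abstract):
        psi_hat +/- t_{1, 1 - alpha/2} * |psi_star - psi_hat|
    """
    rng = rng or random.Random()
    n = len(data)

    # Estimate on the original sample.
    psi_hat = estimator(data)

    # One resample of size n, drawn with replacement, and one refit.
    resample = [data[rng.randrange(n)] for _ in range(n)]
    psi_star = estimator(resample)

    # A t distribution with 1 degree of freedom is the standard Cauchy,
    # whose quantile function has the closed form tan(pi * (p - 1/2)).
    t_quantile = math.tan(math.pi * (0.5 - alpha / 2))

    half_width = t_quantile * abs(psi_star - psi_hat)
    return psi_hat - half_width, psi_hat + half_width


# Toy usage: interval for the mean of simulated standard normal data.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
lower, upper = cheap_bootstrap_ci_one_resample(sample, statistics.fmean, rng=rng)
print(f"cheap bootstrap interval (B = 1): [{lower:.3f}, {upper:.3f}]")
```

Only one extra model fit is needed, which is where the computational saving comes from. The price is a wide critical value (about 12.7 at the 95% level with one degree of freedom), so intervals from a single resample tend to be wider than those from many resamples.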
Taxonomy
Topics: Statistical Methods and Inference · Gaussian Processes and Bayesian Inference · Probabilistic and Robust Engineering Design
