Black-Box Model Confidence Sets Using Cross-Validation with High-Dimensional Gaussian Comparison
Nicholas Kissel, Jing Lei

TL;DR
This paper develops high-dimensional Gaussian comparison results for cross-validated risk estimates, providing theoretical support for constructing model confidence sets in scenarios with many models and tuning parameters.
Contribution
The paper introduces a high-dimensional Gaussian comparison framework for cross-validated risk estimates, supporting valid inference even when the number of candidate models and tuning parameters exceeds the fitting sample size.
Findings
Provides Gaussian comparison results for high-dimensional cross-validation
Supports the construction of model confidence sets in complex settings
Bridges stability-based CLT with high-dimensional Gaussian comparison
Abstract
We derive high-dimensional Gaussian comparison results for the standard V-fold cross-validated risk estimates. Our results combine a recent stability-based argument for the low-dimensional central limit theorem of cross-validation with the high-dimensional Gaussian comparison framework for sums of independent random variables. These results give new insights into the joint sampling distribution of cross-validated risks in the context of model comparison and tuning parameter selection, where the number of candidate models and tuning parameters can be larger than the fitting sample size. As a consequence, our results provide theoretical support for a recent methodological development that constructs model confidence sets using cross-validation.
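The idea of a cross-validation-based model confidence set can be illustrated with a minimal sketch. The code below is not the authors' procedure. It assumes squared-error loss, a toy ridge-regression family, and a simplified Gaussian multiplier bootstrap on the maximum of standardized held-out loss differences, and it ignores the dependence between folds that the paper's theory is designed to handle. The names `ridge`, `cv_losses`, and `confidence_set` are hypothetical and chosen only for illustration.

```python
import numpy as np


def ridge(lam):
    """Return a fit-and-predict function for ridge regression with penalty lam."""
    def fit_predict(X_tr, y_tr, X_te):
        p = X_tr.shape[1]
        beta = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(p), X_tr.T @ y_tr)
        return X_te @ beta
    return fit_predict


def cv_losses(X, y, fit_predict_fns, n_folds=5, seed=0):
    """Held-out squared-error losses: one row per sample, one column per model."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    losses = np.empty((n, len(fit_predict_fns)))
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        for m, fit_predict in enumerate(fit_predict_fns):
            pred = fit_predict(X[train_mask], y[train_mask], X[test_idx])
            losses[test_idx, m] = (y[test_idx] - pred) ** 2
    return losses


def confidence_set(losses, alpha=0.05, n_boot=1000, seed=0):
    """Keep model m unless it looks significantly worse than some competitor.

    Simplifying assumption: rows of `losses` are treated as independent.
    """
    rng = np.random.default_rng(seed)
    n, n_models = losses.shape
    if n_models == 1:
        return [0]
    keep = []
    for m in range(n_models):
        others = [j for j in range(n_models) if j != m]
        diffs = losses[:, [m]] - losses[:, others]  # shape (n, n_models - 1)
        mean = diffs.mean(axis=0)
        sd = diffs.std(axis=0, ddof=1)
        sd = np.where(sd > 0, sd, 1.0)  # guard against zero variance
        stat = np.max(np.sqrt(n) * mean / sd)
        # Gaussian multiplier bootstrap for the max of the centered differences.
        centered = (diffs - mean) / sd
        xi = rng.standard_normal((n_boot, n))
        boot = (xi @ centered / np.sqrt(n)).max(axis=1)
        p_value = np.mean(boot >= stat)
        if p_value > alpha:
            keep.append(m)
    return keep


# Toy usage on simulated data (all settings here are illustrative assumptions).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = X @ np.ones(5) + rng.standard_normal(200)
models = [ridge(0.1), ridge(1e6)]
losses = cv_losses(X, y, models)
print(confidence_set(losses))
```

The set returned contains the indices of the models that the test fails to reject as best at level `alpha`. Taking the maximum over all competitors is what makes a high-dimensional Gaussian comparison relevant: the number of columns in `diffs` grows with the number of candidate models.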
Taxonomy
Topics: Statistical Methods and Inference · Probability and Risk Models · Probabilistic and Robust Engineering Design
