Black-Box Model Confidence Sets Using Cross-Validation with High-Dimensional Gaussian Comparison

Nicholas Kissel; Jing Lei

arXiv:2211.04958·math.ST·November 15, 2023

Open Access

TL;DR

This paper develops high-dimensional Gaussian comparison results for cross-validated risk estimates, providing theoretical support for constructing model confidence sets in scenarios with many models and tuning parameters.

Contribution

It introduces a high-dimensional Gaussian comparison framework for $V$-fold cross-validated risk estimates that supports inference even when the number of candidate models and tuning parameters is larger than the fitting sample size.

Findings

01 Provides Gaussian comparison results for high-dimensional cross-validation

02 Supports the construction of model confidence sets in complex settings

03 Bridges stability-based CLT with high-dimensional Gaussian comparison

Abstract

We derive high-dimensional Gaussian comparison results for the standard $V$-fold cross-validated risk estimates. Our results combine a recent stability-based argument for the low-dimensional central limit theorem of cross-validation with the high-dimensional Gaussian comparison framework for sums of independent random variables. These results give new insights into the joint sampling distribution of cross-validated risks in the context of model comparison and tuning parameter selection, where the number of candidate models and tuning parameters can be larger than the fitting sample size. As a consequence, our results provide theoretical support for a recent methodological development that constructs model confidence sets using cross-validation.
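To make the objects in the abstract concrete, the following is a minimal sketch of how per-observation $V$-fold cross-validated losses for several candidate models could be turned into a model confidence set. It is an illustration under stated assumptions, not the authors' procedure: the ridge-regression candidates, the squared-error loss, the Gaussian multiplier bootstrap used to calibrate the maximum statistic, and all function names (`ridge_fit_predict`, `cv_loss_matrix`, `model_confidence_set`) are hypothetical choices made here.

```python
import numpy as np


def ridge_fit_predict(X_tr, y_tr, X_te, lam):
    """Fit ridge regression on the training folds and predict on the held-out fold."""
    p = X_tr.shape[1]
    beta = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(p), X_tr.T @ y_tr)
    return X_te @ beta


def cv_loss_matrix(X, y, lams, V, rng):
    """Return an (n, M) matrix of held-out squared-error losses under V-fold CV."""
    n = X.shape[0]
    folds = np.array_split(rng.permutation(n), V)
    losses = np.empty((n, len(lams)))
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        for r, lam in enumerate(lams):
            pred = ridge_fit_predict(X[train_mask], y[train_mask], X[test_idx], lam)
            losses[test_idx, r] = (y[test_idx] - pred) ** 2
    return losses


def model_confidence_set(losses, alpha, B, rng):
    """Keep model r unless its CV loss is significantly larger than a competitor's."""
    n, M = losses.shape
    if M == 1:
        return [0]
    keep = []
    for r in range(M):
        others = [s for s in range(M) if s != r]
        diff = losses[:, [r]] - losses[:, others]      # loss differences, shape (n, M-1)
        mean = diff.mean(axis=0)
        sd = diff.std(axis=0, ddof=1)
        sd = np.where(sd > 0, sd, 1.0)                 # guard against division by zero
        stat = np.max(np.sqrt(n) * mean / sd)          # studentized max statistic
        centered = (diff - mean) / sd
        mult = rng.standard_normal((B, n))             # Gaussian multipliers (assumed scheme)
        boot = (mult @ centered / np.sqrt(n)).max(axis=1)
        p_value = np.mean(boot >= stat)
        if p_value > alpha:
            keep.append(r)
    return keep


# Synthetic data: assumed setup for illustration only.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.standard_normal((n, p))
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 0.0]) + rng.standard_normal(n)

lams = [0.01, 1.0, 1e4]                                # candidate tuning parameters
losses = cv_loss_matrix(X, y, lams, V=5, rng=rng)
cv_risk = losses.mean(axis=0)                          # V-fold CV risk estimate per model
conf_set = model_confidence_set(losses, alpha=0.1, B=1000, rng=rng)
```

Here `cv_risk` holds the cross-validated risk estimates whose joint distribution the abstract refers to, and `conf_set` lists the indices of candidates that are not rejected as worse than some competitor. The maximum over competitors is the step where a high-dimensional Gaussian approximation would be invoked when the number of candidates is large.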

Taxonomy

Topics: Statistical Methods and Inference · Probability and Risk Models · Probabilistic and Robust Engineering Design