Conditional predictive inference post model selection
Hannes Leeb

TL;DR
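This paper gives a finite-sample analysis of predictive inference after model selection in high-dimensional regression with random design. It shows that a certain prediction interval is approximately valid and short with high probability, conditional on the training sample, under weak conditions on the number and complexity of the candidate models.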
Contribution
It introduces a finite-sample framework for predictive inference after model selection in high-dimensional regression with random design, imposing no regularity conditions on the unknown parameters.
Findings
A certain prediction interval is approximately valid and short with high probability in finite samples, conditional on the training sample.
Results hold uniformly over all data-generating processes considered.
The analysis also applies to predictive inference procedures other than intervals, such as threshold tests.
Abstract
We give a finite-sample analysis of predictive inference procedures after model selection in regression with random design. The analysis is focused on a statistically challenging scenario where the number of potentially important explanatory variables can be infinite, where no regularity conditions are imposed on unknown parameters, where the number of explanatory variables in a "good" model can be of the same order as sample size and where the number of candidate models can be of larger order than sample size. The performance of inference procedures is evaluated conditional on the training sample. Under weak conditions on only the number of candidate models and on their complexity, and uniformly over all data-generating processes under consideration, we show that a certain prediction interval is approximately valid and short with high probability in finite samples, in the sense that…
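To make "evaluated conditional on the training sample" concrete, the following is a minimal simulation sketch and not the paper's construction. It assumes Gaussian regressors and errors, nested candidate models, a degrees-of-freedom-adjusted error estimate as the selection criterion, and a nominal 95% normal interval. The function name `conditional_coverage`, the coefficient vector `beta`, and all parameter values are hypothetical choices made for illustration only. One training sample is drawn and held fixed, a model is selected and fitted on it, and the interval's coverage is then measured over many fresh draws.

```python
import numpy as np


def conditional_coverage(n=200, p=50, n_new=20000, seed=0):
    """Select a model on one training sample, then check coverage on new draws."""
    rng = np.random.default_rng(seed)

    # Hypothetical data-generating process: decaying coefficients, unit noise.
    beta = 1.0 / (1.0 + np.arange(p)) ** 2
    X = rng.standard_normal((n, p))
    y = X @ beta + rng.standard_normal(n)

    # Candidate models (assumed nested): the first k regressors, k = 1..p//2.
    best_k, best_coef, best_score = None, None, np.inf
    for k in range(1, p // 2 + 1):
        coef, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
        rss = float(np.sum((y - X[:, :k] @ coef) ** 2))
        # Plug-in estimate of out-of-sample squared prediction error
        # (an assumption of this sketch, motivated by Gaussian design).
        score = rss / (n - k) * (n - 1) / (n - k - 1)
        if score < best_score:
            best_k, best_coef, best_score = k, coef, score

    # Nominal 95% interval: prediction +/- 1.96 * estimated error scale.
    half_width = 1.96 * np.sqrt(best_score)

    # Coverage conditional on the training sample: the fitted model stays
    # fixed and only the new observation (x, y) is redrawn.
    X_new = rng.standard_normal((n_new, p))
    y_new = X_new @ beta + rng.standard_normal(n_new)
    covered = np.abs(y_new - X_new[:, :best_k] @ best_coef) <= half_width
    return best_k, float(half_width), float(covered.mean())


if __name__ == "__main__":
    k, hw, cov = conditional_coverage()
    print(f"selected size: {k}, half-width: {hw:.3f}, conditional coverage: {cov:.3f}")
```

Rerunning with different values of `seed` changes the training sample, so the spread of the reported coverage across seeds gives a rough picture of how often the conditional coverage is close to the nominal level.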
