Selective inference after cross-validation
Joshua R. Loftus

TL;DR
This paper introduces a method for valid statistical inference after model selection by cross-validation. It applies to procedures such as the Lasso and forward stepwise, and it does not require knowledge of the error variance.
Contribution
It extends the inference framework of Loftus et al. (2015) for models determined by quadratic constraints to models chosen by cross-validation, enabling valid post-selection inference without knowledge of the error variance.
Findings
Applicable to Lasso and forward stepwise models
Does not require knowledge of error variance
Provides computational methods with R package implementations
Abstract
This paper describes a method for performing inference on models chosen by cross-validation. When the test error being minimized in cross-validation is a residual sum of squares it can be written as a quadratic form. This allows us to apply the inference framework in Loftus et al. (2015) for models determined by quadratic constraints to the model that minimizes CV test error. Our only requirement on the model training procedure is that its selection events are regions satisfying linear or quadratic constraints. This includes both Lasso and forward stepwise, which serve as our main examples throughout. We do not require knowledge of the error variance. The procedures described here are computationally intensive methods of selecting models adaptively and performing inference for the selected model. Implementations are available in an R package.
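The claim that a residual-sum-of-squares test error can be written as a quadratic form can be checked numerically. The following is a minimal sketch, not the paper's implementation or its R package. It assumes that, conditional on the selection events, every fold refits the same fixed set of active columns by ordinary least squares, so each fold's held-out residual is a linear map of y. The function names `cv_rss_quadratic_form` and `cv_rss_direct`, and the arguments `folds` and `active`, are hypothetical names chosen for illustration.

```python
import numpy as np

def cv_rss_quadratic_form(X, folds, active):
    """Return Q such that the CV test RSS equals y @ Q @ y.

    Assumption: the active set is held fixed in every fold and the
    training fit is ordinary least squares on those columns.
    """
    n = X.shape[0]
    Q = np.zeros((n, n))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        X_train = X[np.ix_(train_idx, active)]
        X_test = X[np.ix_(test_idx, active)]
        # A maps the full response y to the held-out residuals of this fold:
        # A @ y = y_test - X_test @ pinv(X_train) @ y_train
        A = np.zeros((len(test_idx), n))
        A[np.arange(len(test_idx)), test_idx] = 1.0
        A[:, train_idx] -= X_test @ np.linalg.pinv(X_train)
        # The squared norm of the fold residuals is y' A'A y.
        Q += A.T @ A
    return Q

def cv_rss_direct(X, y, folds, active):
    """Compute the same CV test RSS by explicit refitting, for comparison."""
    n = X.shape[0]
    total = 0.0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        X_train = X[np.ix_(train_idx, active)]
        X_test = X[np.ix_(test_idx, active)]
        beta = np.linalg.lstsq(X_train, y[train_idx], rcond=None)[0]
        total += np.sum((y[test_idx] - X_test @ beta) ** 2)
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 5))
    y = rng.normal(size=30)
    folds = np.array_split(rng.permutation(30), 5)
    active = [0, 2]  # hypothetical selected columns
    Q = cv_rss_quadratic_form(X, folds, active)
    print(np.isclose(y @ Q @ y, cv_rss_direct(X, y, folds, active)))
```

Because Q depends only on X, the folds, and the active set, comparing the CV error of two candidate models amounts to comparing two quadratic forms in y, which is the kind of quadratic constraint the abstract refers to.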
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Modeling and Causal Inference · Optimal Experimental Design Methods
