Exact Post Model Selection Inference for Marginal Screening
Jason D Lee, Jonathan E Taylor

TL;DR
This paper introduces an exact, non-asymptotic framework for post-model-selection inference in linear regression, yielding valid confidence intervals and hypothesis tests after marginal screening without eigenvalue-like assumptions on the design matrix.
Contribution
It provides a novel exact distribution characterization for linear functions of responses conditioned on model selection, applicable to various selection procedures.
Findings
Exact distribution results for post-selection inference
Framework applicable to multiple selection procedures
Computationally efficient for large datasets
Abstract
We develop a framework for post model selection inference, via marginal screening, in linear regression. At the core of this framework is a result that characterizes the exact distribution of linear functions of the response y, conditional on the model being selected ("condition on selection" framework). This allows us to construct valid confidence intervals and hypothesis tests for regression coefficients that account for the selection procedure. In contrast to recent work in high-dimensional statistics, our results are exact (non-asymptotic) and require no eigenvalue-like assumptions on the design matrix X. Furthermore, the computational cost of marginal regression, constructing confidence intervals and hypothesis testing is negligible compared to the cost of linear regression, thus making our methods particularly suitable for extremely large datasets. Although we focus on…
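To make the "condition on selection" idea concrete, here is a minimal Python sketch, not the authors' implementation. It assumes marginal screening keeps the k columns of X with the largest |x_j^T y|, that the noise covariance is a multiple of the identity, and that the selection event (selected set plus signs) is written as a set of linear inequalities A y <= 0. The function names `marginal_screening`, `selection_constraints`, and `truncation_interval` are hypothetical, chosen for illustration. Under these assumptions, a linear function eta^T y conditioned on the selection event is confined to an interval [lo, hi], and its conditional law is a Gaussian truncated to that interval.

```python
import random
from math import inf

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def marginal_screening(X, y, k):
    """Keep the k columns of X with the largest |x_j^T y|; return (indices, signs)."""
    n, p = len(X), len(X[0])
    scores = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    order = sorted(range(p), key=lambda j: abs(scores[j]), reverse=True)
    selected = sorted(order[:k])
    signs = [1.0 if scores[j] >= 0 else -1.0 for j in selected]
    return selected, signs

def selection_constraints(X, selected, signs):
    """Rows a with a . y <= 0, encoding s_j x_j^T y >= |x_i^T y| for j selected, i not."""
    n, p = len(X), len(X[0])
    rows = []
    for j, s in zip(selected, signs):
        for i in range(p):
            if i in selected:
                continue
            for t in (1.0, -1.0):
                # (t * x_i - s * x_j)^T y <= 0
                rows.append([t * X[r][i] - s * X[r][j] for r in range(n)])
    return rows

def truncation_interval(A, y, eta):
    """Interval for eta . y implied by A y <= 0, assuming Cov(y) is sigma^2 * I."""
    eta_y = dot(eta, y)
    norm2 = dot(eta, eta)
    c = [e / norm2 for e in eta]                      # direction of eta, rescaled
    z = [yi - ci * eta_y for yi, ci in zip(y, c)]     # part of y orthogonal to eta
    lo, hi = -inf, inf
    for a in A:
        ac, az = dot(a, c), dot(a, z)
        if ac < 0:
            lo = max(lo, -az / ac)
        elif ac > 0:
            hi = min(hi, -az / ac)
    return lo, hi

# Small synthetic example (data are random, for illustration only).
random.seed(0)
n, p, k = 20, 6, 2
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
selected, signs = marginal_screening(X, y, k)
A = selection_constraints(X, selected, signs)
eta = [X[i][selected[0]] for i in range(n)]  # contrast: first selected column
lo, hi = truncation_interval(A, y, eta)
```

A p-value or confidence interval would then come from the CDF of a normal distribution truncated to [lo, hi]. That step is omitted here because it also requires the noise variance, which this sketch does not estimate.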
Taxonomy
Topics: Genetic and phenotypic traits in livestock · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
