Valid post-selection inference

Richard Berk; Lawrence Brown; Andreas Buja; Kai Zhang; Linda Zhao

arXiv:1306.1059 · math.ST · June 6, 2013

TL;DR

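This paper obtains valid post-selection inference by reducing the problem to simultaneous inference: confidence intervals are widened until they hold simultaneously for the coefficient estimates of all submodels. The resulting conclusions are valid under any model selection procedure, even if the selected model is incorrect.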

Contribution

The paper proposes an approach to post-selection inference that remains valid under all model selection procedures, obtained by requiring simultaneous coverage for the coefficient estimates of all submodels.

Findings

01

Provides a universally valid post-selection inference method.

02

Ensures validity even when the selected model is incorrect.

03

Less conservative than Scheffé protection.

Abstract

It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid "post-selection inference" by reducing the problem to one of simultaneous inference and hence suitably widening conventional confidence and retention intervals. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing "simultaneity insurance" for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is…
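The reduction to simultaneous inference described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' implementation. It assumes Gaussian errors with a known error standard deviation, a full-column-rank design matrix, and a number of predictors small enough that all submodels can be enumerated by brute force. The function names `posi_directions` and `posi_constant` are hypothetical. The sketch collects one unit vector per coefficient estimate in every submodel and then estimates the constant that the maximum absolute t-statistic over all of them stays below with probability 1 - alpha.

```python
import itertools
import numpy as np


def posi_directions(X):
    """One unit vector per (predictor j, submodel M) pair with j in M.

    Vectors are expressed in coordinates of an orthonormal basis of the
    column space of X, so each has length p. Assumes X has full column rank.
    """
    n, p = X.shape
    # X = Q R with orthonormal Q (n x p), so pinv(X_M) = pinv(R_M) Q^T.
    _, R = np.linalg.qr(X)
    dirs = []
    for size in range(1, p + 1):
        for M in itertools.combinations(range(p), size):
            # Rows of pinv(R_M) are the linear functions that yield the
            # least-squares coefficient estimates in submodel M.
            L = np.linalg.pinv(R[:, list(M)])
            L = L / np.linalg.norm(L, axis=1, keepdims=True)
            dirs.append(L)
    return np.vstack(dirs)


def posi_constant(X, alpha=0.05, n_sim=20000, seed=0):
    """Monte Carlo estimate of the simultaneous constant (known sigma).

    Returns the (1 - alpha) quantile of the maximum absolute standardized
    coefficient estimate over all submodels and all predictors in them.
    """
    U = posi_directions(X)
    rng = np.random.default_rng(seed)
    # Standard normal noise in the p-dimensional column-space coordinates.
    Z = rng.standard_normal((n_sim, U.shape[1]))
    max_abs_t = np.abs(Z @ U.T).max(axis=1)
    return float(np.quantile(max_abs_t, 1 - alpha))


# Illustrative use on a simulated design with 4 predictors.
rng = np.random.default_rng(1)
X_demo = rng.standard_normal((50, 4))
K_demo = posi_constant(X_demo)
print(f"simultaneous constant: {K_demo:.2f}")
```

Under these assumptions, an interval of the form estimate ± K × standard error, with K the constant returned here, would cover the coefficient in whichever submodel is selected. Because every direction is a unit vector, the maximum statistic never exceeds the length of the noise vector, so the constant cannot exceed the corresponding Scheffé bound, which is consistent with the third finding listed above. When the error variance is estimated rather than known, the simulated statistics would need to be divided by an independent estimate of the noise scale, which this sketch omits.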
