Two sources of poor coverage of confidence intervals after model selection
Paul Kabaila, Rheanna Mainzer

TL;DR
This paper compares two sources of poor coverage of confidence intervals constructed after model selection: the data-based selection step sometimes chooses the wrong model, and the data used for selection is reused to construct the interval.
Contribution
The study compares the effect of model selection errors with the effect of data reuse on confidence interval coverage, giving insight into their relative importance.
Findings
Incorrect model selection leads to undercoverage of confidence intervals.
Data reuse in interval construction exacerbates coverage issues.
Both factors significantly impact the reliability of post-model-selection inference.
Abstract
We compare the following two sources of poor coverage of post-model-selection confidence intervals: the preliminary data-based model selection sometimes chooses the wrong model and the data used to choose the model is re-used for the construction of the confidence interval.
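The two sources named in the abstract can be explored with a small Monte Carlo sketch. This is an illustration under assumptions that are not taken from the paper: a two-regressor linear model without intercept, known error variance 1, a preliminary z-test of `beta2 = 0` as the selection rule, and hypothetical parameter values (`n`, `beta1`, `beta2`, `rho`). Setting `reuse=True` selects the model and builds the interval for `beta1` from the same sample; `reuse=False` selects on an independent sample of the same size, so only the wrong-model source remains.

```python
import math
import random

Z = 1.959964  # two-sided 95% standard normal quantile


def draw(rng, n, beta1, beta2, rho):
    """Simulate one sample and return the sums needed for least squares."""
    s11 = s12 = s22 = s1y = s2y = 0.0
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        # x2 is correlated with x1 (correlation rho)
        x2 = rho * x1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        y = beta1 * x1 + beta2 * x2 + rng.gauss(0, 1)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        s1y += x1 * y
        s2y += x2 * y
    return s11, s12, s22, s1y, s2y


def keeps_x2(stats):
    """Selection rule (assumed): keep x2 if the z-test rejects beta2 = 0."""
    s11, s12, s22, s1y, s2y = stats
    det = s11 * s22 - s12 ** 2
    b2 = (s11 * s2y - s12 * s1y) / det
    return abs(b2) / math.sqrt(s11 / det) > Z


def ci_beta1(stats, full):
    """Naive 95% interval for beta1 that treats the chosen model as given."""
    s11, s12, s22, s1y, s2y = stats
    if full:
        det = s11 * s22 - s12 ** 2
        b1 = (s22 * s1y - s12 * s2y) / det
        se = math.sqrt(s22 / det)
    else:
        b1 = s1y / s11
        se = math.sqrt(1 / s11)
    return b1 - Z * se, b1 + Z * se


def coverage(reuse, n=50, beta1=1.0, beta2=0.3, rho=0.8, reps=2000, seed=0):
    """Estimate the coverage probability of the post-selection interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        inference = draw(rng, n, beta1, beta2, rho)
        # Either reuse the inference sample for selection or draw a fresh one
        selection = inference if reuse else draw(rng, n, beta1, beta2, rho)
        lo, hi = ci_beta1(inference, keeps_x2(selection))
        hits += int(lo <= beta1 <= hi)
    return hits / reps


if __name__ == "__main__":
    print("same data for selection and interval:", coverage(reuse=True))
    print("independent data for selection:      ", coverage(reuse=False))
```

Comparing the two printed estimates against the nominal 0.95 gives a rough sense of how much undercoverage remains once data reuse is removed. The result depends on the assumed parameter values and should not be read as a reproduction of the paper's analysis.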
