Two sources of poor coverage of confidence intervals after model selection

Paul Kabaila; Rheanna Mainzer

arXiv:1711.01739 · math.ST · February 20, 2019

TL;DR

This paper compares two sources of poor coverage of confidence intervals constructed after model selection: the data-based selection sometimes chooses the wrong model, and the data used to choose the model is re-used to construct the interval.

Contribution

The study compares the effects of model selection errors and data reuse on confidence interval coverage, providing insights into their relative importance and implications.

Findings

01

Incorrect model selection leads to undercoverage of confidence intervals.

02

Data reuse in interval construction exacerbates coverage issues.

03

Both factors significantly impact the reliability of post-model-selection inference.

Abstract

We compare the following two sources of poor coverage of post-model-selection confidence intervals: the preliminary data-based model selection sometimes chooses the wrong model and the data used to choose the model is re-used for the construction of the confidence interval.
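
The two sources named in the abstract can be illustrated with a small Monte Carlo experiment. The sketch below is not the authors' setup. It assumes a two-regressor linear model with known error variance 1, a preliminary z-test on the second coefficient as the model selection step, and a nominal 95% interval for the first coefficient computed as if the selected model had been given in advance. All function names and parameter values are hypothetical. Setting `split=True` selects the model on one half of the data and builds the interval on the other half, so no data is re-used and only the wrong-model source remains.

```python
import numpy as np

Z = 1.959964  # 97.5% quantile of the standard normal distribution


def ols(X, y):
    """OLS estimates and standard errors, assuming known error variance 1."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    se = np.sqrt(np.diag(xtx_inv))
    return beta, se


def make_data(rng, n, b1, b2, rho):
    """Draw correlated regressors (x1, x2) and y = b1*x1 + b2*x2 + noise."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    y = X @ np.array([b1, b2]) + rng.standard_normal(n)
    return X, y


def covers(X_sel, y_sel, X_ci, y_ci, b1):
    """Select a model on (X_sel, y_sel), then build the naive CI on (X_ci, y_ci)."""
    beta, se = ols(X_sel, y_sel)
    keep_x2 = abs(beta[1] / se[1]) > Z      # preliminary test of b2 = 0
    cols = [0, 1] if keep_x2 else [0]       # full model or reduced model
    beta_ci, se_ci = ols(X_ci[:, cols], y_ci)
    return bool(abs(beta_ci[0] - b1) <= Z * se_ci[0])


def coverage(n=50, b1=1.0, b2=0.3, rho=0.8, reps=5000, seed=0, split=False):
    """Monte Carlo estimate of the coverage of the nominal 95% CI for b1."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        X, y = make_data(rng, n, b1, b2, rho)
        if split:
            h = n // 2  # first half selects the model, second half builds the CI
            hits += covers(X[:h], y[:h], X[h:], y[h:], b1)
        else:
            hits += covers(X, y, X, y, b1)  # same data re-used for both steps
    return hits / reps


if __name__ == "__main__":
    print("data re-used:", coverage())
    print("data split:  ", coverage(split=True))
    print("large b2:    ", coverage(b2=5.0))
```

In this toy setting the reduced model is often selected although `b2` is nonzero, so the estimate of `b1` is biased and the interval covers less often than its nominal level, with or without data re-use. When `b2` is large the full model is almost always selected and coverage returns to about 95%.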
