The coverage probability of confidence intervals in regression after a preliminary F test
Paul Kabaila, Davide Farchione

TL;DR
This paper derives a new, computationally efficient formula for evaluating the minimum coverage probability of confidence intervals in regression after a preliminary F test, revealing potentially severe undercoverage.
Contribution
It introduces a compact formula for the coverage probability of the naive confidence interval constructed after a preliminary F test, making it practical to assess whether such intervals are adequate in regression analysis.
Findings
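The minimum coverage probability of the naive 95% confidence interval can be as low as 8.46%.
The new formula simplifies computation of the coverage probability for any dimension s of the tested parameter vector tau.
An application to analysis of covariance shows that the common practice of constructing naive confidence intervals after a preliminary F test can be inadequate.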
Abstract
Consider a linear regression model with regression parameter beta = (beta_1, ..., beta_p) and independent normal errors. Suppose the parameter of interest is theta = a^T beta, where a is specified. Define the s-dimensional parameter vector tau = C^T beta - t, where C and t are specified. Suppose that we carry out a preliminary F test of the null hypothesis H_0: tau = 0 against the alternative hypothesis H_1: tau not equal to 0. It is common statistical practice to then construct a confidence interval for theta with nominal coverage 1-alpha, using the same data, based on the assumption that the selected model had been given to us a priori (as the true model). We call this the naive 1-alpha confidence interval for theta. This assumption is false, and it may lead to this confidence interval having minimum coverage probability far below 1-alpha, making it completely inadequate. Our aim is to…
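The naive procedure described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's formula: it only estimates, by simulation, the coverage of the naive 95% interval at one chosen parameter value, not the minimum over all values. Everything specific here is an assumption made for illustration: the design matrix (n = 20, p = 3, two highly correlated regressors), the choice theta = beta_1 and tau = beta_2 (so s = 1), unit error variance, and the names `fit` and `naive_coverage`. The t and F quantiles are hard-coded table values for this particular design.

```python
import numpy as np

# Tabulated quantiles, hard-coded for this hypothetical design (n = 20).
T975_DF17 = 2.109816        # t_{0.975}, 17 df: full model, n - p = 20 - 3
T975_DF18 = 2.100922        # t_{0.975}, 18 df: reduced model, n - 2
F95_1_17 = T975_DF17 ** 2   # F_{0.95}(1, 17) equals the squared t quantile


def fit(X, y):
    """Ordinary least squares: estimate, error-variance estimate, (X^T X)^{-1}."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat
    df = X.shape[0] - X.shape[1]
    return beta_hat, resid @ resid / df, xtx_inv


def naive_coverage(beta2, n_rep=20000, seed=0):
    """Estimate coverage of the naive 95% CI for theta = beta_1 when the
    preliminary F test of H_0: beta_2 = 0 decides which model is fitted."""
    rng = np.random.default_rng(seed)
    n = 20
    x1 = np.linspace(-1.0, 1.0, n)
    x2 = x1 + 0.3 * np.cos(np.arange(n))   # assumed: strongly correlated with x1
    X_full = np.column_stack([np.ones(n), x1, x2])
    X_red = X_full[:, :2]                  # model selected when H_0 is accepted
    beta = np.array([0.0, 1.0, beta2])
    theta = beta[1]

    covered = 0
    for _ in range(n_rep):
        y = X_full @ beta + rng.standard_normal(n)
        b, s2, V = fit(X_full, y)
        f_stat = b[2] ** 2 / (s2 * V[2, 2])   # F statistic for s = 1
        if f_stat > F95_1_17:
            # H_0 rejected: usual interval from the full model.
            est, half = b[1], T975_DF17 * np.sqrt(s2 * V[1, 1])
        else:
            # H_0 accepted: usual interval from the reduced model,
            # treated as if that model had been given a priori.
            b, s2, V = fit(X_red, y)
            est, half = b[1], T975_DF18 * np.sqrt(s2 * V[1, 1])
        covered += abs(est - theta) <= half
    return covered / n_rep


if __name__ == "__main__":
    for beta2 in (0.0, 1.0, 2.0, 4.0):
        print(beta2, naive_coverage(beta2))
```

In this assumed design the reduced-model estimator of beta_1 is biased whenever beta_2 is nonzero, so values of beta_2 that the F test often fails to detect should give estimated coverage well below the nominal 0.95. Scanning a grid of beta_2 values gives only a rough simulation-based view of the minimum coverage that the paper evaluates exactly.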
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Advanced Statistical Process Monitoring
