Covariate Selection Based on a Model-free Approach to Linear Regression with Exact Probabilities
Laurie Davies, Lutz Dümbgen

TL;DR
This paper introduces a novel, model-free covariate selection method for linear regression that uses exact Gaussian P-values based on the Beta distribution, offering a simple, fast, and reliable alternative to traditional techniques.
Contribution
It proposes a new covariate selection approach based on exact Gaussian P-values, which requires only least squares and a single cut-off value and eliminates the need for regularization, data splitting, or simulations.
Findings
Method is simple, fast, and avoids overfitting.
Outperforms existing procedures in simulations and real data.
Provides asymptotic theoretical guarantees.
Abstract
In this paper we give a completely new approach to the problem of covariate selection in linear regression. A covariate or a set of covariates is included only if it is better in the sense of least squares than the same number of Gaussian covariates consisting of i.i.d. random variables. The Gaussian P-value is defined as the probability that the Gaussian covariates are better. It is given in terms of the Beta distribution, is exact, and holds for all data, making it model-free. The covariate selection procedures require only a cut-off value for the Gaussian P-value: the default value in this paper is . The resulting procedures are very simple, very fast, do not overfit and require only least squares. In particular there is no regularization parameter, no data splitting, no use of simulations, no shrinkage and no post selection inference is…
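To make the definition concrete, the following is a minimal sketch of a Gaussian P-value for adding one covariate to an existing least-squares fit. It is not the authors' code. It relies on a standard geometric fact: if a single i.i.d. standard Gaussian covariate is added to a fit with m linearly independent covariates, the ratio of the new residual sum of squares to the old one follows a Beta((n - m - 1)/2, 1/2) distribution, whatever the data are. The function names `rss` and `gaussian_p_value`, the simulated data, and the assumption that `X_base` has full column rank with n > m + 1 are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.stats import beta


def rss(y, X):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    if X.shape[1] == 0:
        return float(y @ y)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def gaussian_p_value(y, X_base, x_new):
    """Probability that one i.i.d. Gaussian covariate reduces the residual
    sum of squares at least as much as x_new does (hypothetical helper).

    Assumes X_base (n x m) has full column rank and n > m + 1.
    """
    n, m = X_base.shape
    rss0 = rss(y, X_base)                               # fit without x_new
    rss1 = rss(y, np.column_stack([X_base, x_new]))     # fit with x_new
    # With a Gaussian covariate Z in place of x_new,
    # RSS_Z / rss0 ~ Beta((n - m - 1) / 2, 1 / 2) for any fixed y and X_base.
    return float(beta.cdf(rss1 / rss0, (n - m - 1) / 2, 0.5))


# Illustrative data: y depends on x_signal but not on x_noise.
rng = np.random.default_rng(0)
n = 100
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = 1.0 + 2.0 * x_signal + rng.normal(size=n)
X_base = np.ones((n, 1))  # intercept only

p_signal = gaussian_p_value(y, X_base, x_signal)
p_noise = gaussian_p_value(y, X_base, x_noise)
print(f"P-value for x_signal: {p_signal:.3g}")
print(f"P-value for x_noise:  {p_noise:.3g}")
```

A covariate would be kept only if its P-value falls below the chosen cut-off. For a single covariate this number coincides with the classical F-test P-value, but here it is interpreted without any model for the data. When the best of several candidate covariates is compared with the best of the same number of Gaussian covariates, as the abstract describes, the single-covariate value has to be adjusted for that selection; the sketch does not cover this case.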
Taxonomy
Topics: Statistical Methods and Inference · Fault Detection and Control Systems · Gaussian Processes and Bayesian Inference
