Robust Information Criterion for Model Selection in Sparse High-Dimensional Linear Regression Models
Prakash B. Gohain, Magnus Jansson

TL;DR
This paper introduces EBIC-Robust, a model selection criterion for sparse high-dimensional linear regression that is invariant to data scaling and consistent both as the sample size grows and as the SNR grows, and that outperforms EBIC and EFIC in simulations.
Contribution
The paper proposes EBIC-Robust, an improved model selection criterion that is invariant to data scaling and consistent in both large sample and high-SNR scenarios.
Findings
EBIC-Robust outperforms EBIC and EFIC in simulations.
EBIC-Robust is invariant to data scaling.
Theoretical proofs establish its consistency in both the large-sample and the high-SNR regimes.
Abstract
Model selection in linear regression models is a major challenge when dealing with high-dimensional data where the number of available measurements (sample size) is much smaller than the dimension of the parameter space. Traditional methods for model selection such as the Akaike information criterion, the Bayesian information criterion (BIC), and minimum description length are heavily prone to overfitting in the high-dimensional setting. In this regard, extended BIC (EBIC), which is an extended version of the original BIC, and the extended Fisher information criterion (EFIC), which is a combination of EBIC and the Fisher information criterion, are consistent estimators of the true model as the number of measurements grows very large. However, EBIC is not consistent in high signal-to-noise-ratio (SNR) scenarios where the sample size is fixed, and EFIC is not invariant to data scaling, resulting in unstable…
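For orientation, the classical EBIC that the abstract takes as a baseline can be sketched as follows. This is a minimal illustration of the commonly cited Gaussian-likelihood form, N ln(RSS/N) + k ln N + 2γ ln C(p, k); it is not the EBIC-Robust criterion proposed in the paper, whose formula does not appear in this summary. The function names `log_binom` and `ebic`, the default `gamma=1.0`, and the synthetic data are assumptions made for the example.

```python
import math
import numpy as np


def log_binom(p, k):
    """Natural log of the binomial coefficient C(p, k), via log-gamma."""
    return math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)


def ebic(y, X, support, gamma=1.0):
    """Classical EBIC score of the candidate model indexed by `support`.

    Assumes Gaussian noise and len(support) < number of rows of X, so the
    residual sum of squares is strictly positive. Lower scores are better.
    """
    n, p = X.shape
    k = len(support)
    if k == 0:
        rss = float(y @ y)
    else:
        Xs = X[:, list(support)]
        # Least-squares fit restricted to the candidate columns.
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ coef
        rss = float(resid @ resid)
    return n * math.log(rss / n) + k * math.log(n) + 2.0 * gamma * log_binom(p, k)


# Hypothetical high-dimensional setup: 50 measurements, 200 candidate predictors.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 200))
y = 3.0 * X[:, 3] - 2.0 * X[:, 17] + rng.standard_normal(50)

score_true = ebic(y, X, [3, 17])          # the two predictors that generated y
score_over = ebic(y, X, [3, 17, 42, 99])  # same model plus two irrelevant predictors
print(score_true, score_over)
```

Setting `gamma=0.0` reduces the score to ordinary BIC; the extra 2γ ln C(p, k) term is what penalizes the large number of candidate models of size k when p is much greater than N.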
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models
Methods: Linear Regression
