Variable Selection with Second-Generation P-Values
Yi Zuo, Thomas G. Stewart, Jeffrey D. Blume

TL;DR
ProSGPV is a variable selection method that applies an l0 penalization scheme based on second-generation p-values (SGPVs). It recovers the true model at the best rate achieved by current standards, often yields the smallest parameter estimation error, and stays competitive in prediction, including under strong collinearity and in high-dimensional (p > n) settings.
Contribution
It presents a new variable selection approach that balances support recovery, parameter estimation, and prediction, and that remains reliable when features are strongly collinear or when p > n.
Findings
ProSGPV captures the true model at the best rate achieved by current standard methods.
It maintains performance under high collinearity and p > n scenarios.
ProSGPV outperforms smoothly clipped absolute deviation (SCAD), adaptive lasso (AL), and the minimax concave penalty (MC+) in inference tasks.
Abstract
Many statistical methods have been proposed for variable selection in the past century, but few balance inference and prediction tasks well. Here we report on a novel variable selection approach called Penalized regression with Second-Generation P-Values (ProSGPV). It captures the true model at the best rate achieved by current standards, is easy to implement in practice, and often yields the smallest parameter estimation error. The idea is to use an l0 penalization scheme with second-generation p-values (SGPV), instead of traditional ones, to determine which variables remain in a model. The approach yields tangible advantages for balancing support recovery, parameter estimation, and prediction tasks. The ProSGPV algorithm can maintain its good performance even when there is strong collinearity among features or when a high dimensional feature space with p > n is considered. We present…
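The selection idea in the abstract can be illustrated with a minimal sketch. The SGPV formula below follows the published definition of second-generation p-values: the fraction of an interval estimate that overlaps an interval null, with a correction for very wide intervals. Everything else is an assumption made for illustration: the function names `sgpv` and `select_by_sgpv`, the use of ordinary least squares with 95% Wald intervals, and the user-supplied `null_bound`. It is not the authors' implementation, and it omits any screening or tuning steps the full ProSGPV algorithm may use.

```python
import numpy as np


def sgpv(lo, hi, null_lo, null_hi):
    """Second-generation p-value of the interval estimate [lo, hi]
    against the interval null [null_lo, null_hi]."""
    est_len = hi - lo
    null_len = null_hi - null_lo
    # Length of the overlap between the two intervals (0 if disjoint).
    overlap = max(0.0, min(hi, null_hi) - max(lo, null_lo))
    if est_len == 0.0:
        # Degenerate (point) estimate: inside the null or not.
        return 1.0 if null_lo <= lo <= null_hi else 0.0
    # The max(...) term caps the value at 1/2 for very wide intervals.
    return (overlap / est_len) * max(est_len / (2.0 * null_len), 1.0)


def select_by_sgpv(X, y, null_bound, z=1.96):
    """Hypothetical helper: fit OLS (requires n > p) and keep the
    columns whose Wald interval has an SGPV of exactly 0 against
    the null interval [-null_bound, null_bound]."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return [
        j for j in range(p)
        if sgpv(beta[j] - z * se[j], beta[j] + z * se[j],
                -null_bound, null_bound) == 0.0
    ]


if __name__ == "__main__":
    # Simulated data: only columns 0 and 2 carry signal.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=200)
    print(select_by_sgpv(X, y, null_bound=0.5))
```

An SGPV of 0 means the interval estimate lies entirely outside the null interval, so the data support only effects the analyst considers meaningful. Dropping every variable that fails this test is a hard-thresholding rule, which is why the abstract describes the scheme as a form of l0 penalization. The value `null_bound=0.5` in the demo is arbitrary.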
Taxonomy
Topics: Statistical Methods and Inference · Sparse and Compressive Sensing Techniques · Control Systems and Identification
