TL;DR
This paper introduces ranked sparsity, a regularization framework that improves feature interaction selection by requiring stronger evidence for inclusion, leading to more accurate, interpretable, and less overfitted models.
Contribution
The paper proposes the sparsity-ranked lasso (SRL), a method that drops the assumption of covariate equipoise and improves model selection when interactions and polynomial terms are candidate predictors.
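To make the idea concrete, here is a minimal sketch of a lasso whose penalty differs by feature group, so that interactions need stronger evidence than main effects to enter the model. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`add_interactions`, `weighted_lasso`, `ranked_weights`) are hypothetical, and the weighting rule (square root of group size, scaled so main effects have weight 1) is an assumed choice that this summary does not specify.

```python
import math
from itertools import combinations, product

def soft_threshold(z, t):
    """Soft-thresholding operator used in lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def add_interactions(rows):
    """Append all pairwise products to each row of main effects."""
    p = len(rows[0])
    pairs = list(combinations(range(p), 2))
    X = [list(r) + [r[i] * r[j] for i, j in pairs] for r in rows]
    return X, pairs

def ranked_weights(n_main, n_inter):
    """Assumed weighting: sqrt of group size, scaled so main effects get 1."""
    return [1.0] * n_main + [math.sqrt(n_inter / n_main)] * n_inter

def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j| (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - Xb, with b = 0 initially
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j excluded)
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta

# Toy data: full 2^4 factorial design, strong main effect, weak interaction
rows = [list(r) for r in product([-1.0, 1.0], repeat=4)]
X, pairs = add_interactions(rows)
y = [2.0 * r[0] + 0.3 * r[0] * r[1] for r in rows]

uniform = weighted_lasso(X, y, lam=0.25, weights=[1.0] * len(X[0]))
ranked = weighted_lasso(X, y, lam=0.25, weights=ranked_weights(4, len(pairs)))
print("uniform penalty:", [round(b, 3) for b in uniform])
print("ranked penalty: ", [round(b, 3) for b in ranked])
```

On this toy design, the uniform penalty keeps a small coefficient for the weak interaction, while the ranked penalty shrinks it to exactly zero and leaves the main-effect estimate unchanged. The single penalty level `lam=0.25` is picked by hand for illustration; in practice it would be tuned, for example by cross-validation.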
Findings
SRL outperforms competing methods in simulations
SRL produces more transparent models with fewer false interactions
SRL is fast and accurate in high-dimensional settings
Abstract
We explore and illustrate the concept of ranked sparsity, a phenomenon that often occurs naturally in modeling applications when an expected disparity exists in the quality of information between different feature sets. Its presence can cause traditional and modern model selection methods to fail because such procedures commonly presume that each potential parameter is equally worthy of entering into the final model; we call this presumption "covariate equipoise". However, this presumption does not always hold, especially in the presence of derived variables. For instance, when all possible interactions are considered as candidate predictors, the premise of covariate equipoise will often produce over-specified and opaque models. The sheer number of additional candidate variables grossly inflates the number of false discoveries in the interactions, resulting in unnecessarily complex and…