Ranked Sparsity: A Cogent Regularization Framework for Selecting and Estimating Feature Interactions and Polynomials

Ryan A. Peterson; Joseph E. Cavanaugh

arXiv:2107.07594·stat.ME·January 28, 2022

TL;DR

This paper introduces ranked sparsity, a regularization framework that requires stronger evidence before an interaction or polynomial term enters the model. The resulting models are more accurate, more interpretable, and less prone to overfitting.

Contribution

The paper proposes the sparsity-ranked lasso (SRL), which drops the presumption of covariate equipoise (that every candidate parameter is equally worthy of entering the final model) and improves model selection for interactions and polynomials.
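
The idea of requiring stronger evidence from interactions can be illustrated with a weighted lasso penalty. The sketch below is not the authors' implementation: it assumes an orthonormal design, where the lasso solution for each coefficient is a soft-threshold of its least-squares estimate, and the penalty weights (1 for main effects, 2 for interactions), the estimates in `z`, and the function names are all hypothetical values chosen for illustration.

```python
def soft_threshold(z, threshold):
    """Closed-form lasso update for one coefficient under an orthonormal design."""
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0


def weighted_lasso_orthonormal(z, weights, lam):
    """Apply a per-coefficient penalty lam * weight to least-squares estimates z."""
    return [soft_threshold(zj, lam * wj) for zj, wj in zip(z, weights)]


# Hypothetical least-squares estimates: two main effects, then two interactions.
z = [0.9, 0.2, 0.9, 0.2]
lam = 0.5

# Covariate equipoise: every candidate gets the same penalty.
equal_weights = [1.0, 1.0, 1.0, 1.0]

# Ranked penalty (assumed weights): interactions are penalized twice as hard.
ranked_weights = [1.0, 1.0, 2.0, 2.0]

equipoise_fit = weighted_lasso_orthonormal(z, equal_weights, lam)
ranked_fit = weighted_lasso_orthonormal(z, ranked_weights, lam)

print("equal penalty :", equipoise_fit)
print("ranked penalty:", ranked_fit)
```

Under the equal penalty, the interaction with estimate 0.9 is retained alongside the main effect of the same size. Under the ranked penalty, its threshold rises to 1.0, so it is set to zero while the main effect stays in the model.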

Findings

1. SRL outperforms competing methods in simulations
2. SRL produces more transparent models with fewer false interactions
3. SRL is fast and accurate in high-dimensional settings

Abstract

We explore and illustrate the concept of ranked sparsity, a phenomenon that often occurs naturally in modeling applications when an expected disparity exists in the quality of information between different feature sets. Its presence can cause traditional and modern model selection methods to fail because such procedures commonly presume that each potential parameter is equally worthy of entering into the final model - we call this presumption "covariate equipoise". However, this presumption does not always hold, especially in the presence of derived variables. For instance, when all possible interactions are considered as candidate predictors, the premise of covariate equipoise will often produce over-specified and opaque models. The sheer number of additional candidate variables grossly inflates the number of false discoveries in the interactions, resulting in unnecessarily complex and…
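
The abstract's point about the sheer number of candidate interactions can be made concrete with a little arithmetic. The sketch below counts main effects and pairwise interactions for a few values of `p`. The false-discovery figure is a deliberate simplification: it assumes every interaction is truly null and is admitted independently with a fixed probability `alpha`, which is not how a lasso selects variables. The function names and the value of `alpha` are illustrative assumptions.

```python
from math import comb


def candidate_counts(p):
    """Return (number of main effects, number of pairwise interactions)."""
    return p, comb(p, 2)


def expected_false_discoveries(n_null, alpha):
    """Expected count of null candidates admitted at per-candidate rate alpha."""
    return n_null * alpha


alpha = 0.05  # assumed per-candidate false-positive rate, for illustration only

for p in (10, 50, 100):
    n_main, n_inter = candidate_counts(p)
    print(
        f"p={p}: {n_main} main effects, {n_inter} pairwise interactions, "
        f"about {expected_false_discoveries(n_inter, alpha):.1f} spurious "
        f"interactions if all are null"
    )
```

Because the number of pairwise interactions grows quadratically in `p`, treating each one as an equally plausible candidate lets spurious interactions outnumber the main effects themselves once `p` is moderately large.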

Code & Models

Repositories

softwarecorner/2021-38-3
