Loss-guided Stability Selection

Tino Werner

arXiv:2202.04956 · cs.LG · February 11, 2022

Open Access

TL;DR

This paper introduces a loss-guided Stability Selection variant that chooses the stable predictor set with respect to the selected loss function and validates it on out-of-sample data, yielding sparser and more precise models than the raw base learners.

Contribution

It proposes a novel stability selection variant that incorporates loss function considerations and out-of-sample validation, enhancing model sparsity and accuracy in high-dimensional data.

Findings

01 Significant precision improvement over raw Boosting models.

02 Reduces issues of underfitting in noisy high-dimensional data.

03 Applicable to both regression and binary classification.

Abstract

In modern data analysis, sparse model selection becomes inevitable once the number of predictor variables is very high. It is well known that model selection procedures like the Lasso or Boosting tend to overfit on real data. The celebrated Stability Selection overcomes these weaknesses by aggregating models fitted on subsamples of the training data and then choosing a stable predictor set, which is usually much sparser than the predictor sets of the raw models. The standard Stability Selection is based on a global criterion, namely the per-family error rate, and additionally requires expert knowledge to configure the hyperparameters suitably. Since model selection depends on the loss function, i.e., predictor sets selected w.r.t. one particular loss function differ from those selected w.r.t. another, we propose a Stability Selection variant which respects the…
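The two ideas in the abstract, aggregating selections over subsamples and then letting a loss function decide which predictors count as stable, can be illustrated with a minimal sketch. This is not the paper's algorithm: the base selector `top_q_selector` is a hypothetical stand-in for the Lasso or Boosting, and the squared loss, the threshold grid, the least-squares refit, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def top_q_selector(X, y, q):
    """Hypothetical stand-in for a sparse base learner (e.g. Lasso or Boosting):
    returns the indices of the q predictors most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    score = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) + 1e-12)
    return np.argsort(score)[-q:]

def selection_frequencies(X, y, q, n_subsamples, rng):
    """Fit the base selector on random half-subsamples and return, for each
    predictor, the fraction of subsamples in which it was selected."""
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        counts[top_q_selector(X[idx], y[idx], q)] += 1
    return counts / n_subsamples

def loss_guided_stable_set(X_tr, y_tr, X_val, y_val, freqs, thresholds):
    """Assumed loss-guided step: for each candidate threshold, refit least
    squares on the predictors whose frequency reaches it and keep the set
    with the smallest out-of-sample squared loss."""
    best_loss, best_t, best_set = np.inf, None, None
    for t in thresholds:
        S = np.flatnonzero(freqs >= t)
        if S.size == 0:
            continue  # an empty predictor set is skipped in this sketch
        A_tr = np.column_stack([np.ones(len(y_tr)), X_tr[:, S]])
        A_val = np.column_stack([np.ones(len(y_val)), X_val[:, S]])
        coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
        loss = np.mean((y_val - A_val @ coef) ** 2)
        if loss < best_loss:
            best_loss, best_t, best_set = loss, t, S
    return best_loss, best_t, best_set

# Synthetic data (an assumption): 40 predictors, only the first 3 are relevant.
rng = np.random.default_rng(0)
n, p = 300, 40
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 2.0]
y = X @ beta + rng.standard_normal(n)
X_tr, y_tr, X_val, y_val = X[:200], y[:200], X[200:], y[200:]

freqs = selection_frequencies(X_tr, y_tr, q=5, n_subsamples=50, rng=rng)
best_loss, best_t, stable_set = loss_guided_stable_set(
    X_tr, y_tr, X_val, y_val, freqs, thresholds=[0.5, 0.6, 0.7, 0.8, 0.9]
)
print("threshold:", best_t, "stable set:", stable_set, "validation loss:", best_loss)
```

The design point the sketch tries to convey is that the threshold is not fixed in advance through an error-rate bound; it is chosen by comparing candidate stable sets on held-out data under the loss of interest. Swapping the squared loss for a classification loss would change which set is preferred.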

Taxonomy

Topics: Statistical Methods and Inference · Bayesian Modeling and Causal Inference · Fault Detection and Control Systems