
TL;DR
This paper introduces a loss-guided stability selection method that improves sparse model selection by aligning it with the chosen loss function and validating with out-of-sample data, leading to more precise and stable models.
Contribution
It proposes a novel stability selection variant that incorporates loss function considerations and out-of-sample validation, enhancing model sparsity and accuracy in high-dimensional data.
Findings
Significant precision improvement over raw Boosting models.
Reduces issues of underfitting in noisy high-dimensional data.
Applicable to both regression and binary classification.
Abstract
In modern data analysis, sparse model selection becomes inevitable once the number of predictor variables is very high. It is well known that model selection procedures like the Lasso or Boosting tend to overfit on real data. The celebrated Stability Selection overcomes these weaknesses by aggregating models, based on subsamples of the training data, followed by choosing a stable predictor set which is usually much sparser than the predictor sets from the raw models. The standard Stability Selection is based on a global criterion, namely the per-family error rate, while additionally requiring expert knowledge to suitably configure the hyperparameters. Since model selection depends on the loss function, i.e., predictor sets selected w.r.t. some particular loss function differ from those selected w.r.t. some other loss function, we propose a Stability Selection variant which respects the…
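The subsample-and-aggregate idea described in the abstract can be illustrated with a small sketch. The abstract is truncated here, so the paper's exact procedure is not reproduced; everything below is an assumption made for illustration. In particular, the toy covariance-based selector stands in for Boosting or the Lasso, and the threshold grid, the least-squares refit, and all function names (`top_k_selector`, `selection_frequencies`, `loss_guided_stable_set`, `make_toy_data`) are hypothetical. The sketch computes selection frequencies over half-size subsamples, then picks the frequency threshold whose stable set gives the lowest loss on held-out data.

```python
import random

def top_k_selector(X, y, k):
    """Toy base learner (assumption): the k columns with largest |covariance| with y."""
    n, p = len(X), len(X[0])
    y_bar = sum(y) / n
    def score(j):
        x_bar = sum(row[j] for row in X) / n
        return abs(sum((row[j] - x_bar) * (t - y_bar) for row, t in zip(X, y)))
    return sorted(range(p), key=score, reverse=True)[:k]

def selection_frequencies(X, y, k, n_subsamples=50, seed=0):
    """Fraction of half-size subsamples on which each predictor is selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)
        for j in top_k_selector([X[i] for i in idx], [y[i] for i in idx], k):
            counts[j] += 1
    return [c / n_subsamples for c in counts]

def fit_least_squares(X, y, cols):
    """Least squares on the chosen columns (no intercept) via the normal equations."""
    m = len(cols)
    A = [[sum(r[a] * r[b] for r in X) for b in cols]
         + [sum(r[a] * t for r, t in zip(X, y))] for a in cols]
    for i in range(m):
        A[i][i] += 1e-8  # tiny ridge term for numerical stability
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for i in range(m):
        piv = max(range(i, m), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(m):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [u - f * v for u, v in zip(A[r], A[i])]
    return [A[i][m] / A[i][i] for i in range(m)]

def squared_loss(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def loss_guided_stable_set(X_tr, y_tr, X_val, y_val, k, thresholds, loss=squared_loss):
    """Return (validation loss, threshold, stable columns) with the lowest
    out-of-sample loss, or None if every threshold gives an empty set."""
    freqs = selection_frequencies(X_tr, y_tr, k)
    best = None
    for thr in thresholds:
        cols = [j for j, f in enumerate(freqs) if f >= thr]
        if not cols:
            continue
        beta = fit_least_squares(X_tr, y_tr, cols)
        preds = [sum(b * row[j] for b, j in zip(beta, cols)) for row in X_val]
        val_loss = loss(y_val, preds)
        if best is None or val_loss < best[0]:
            best = (val_loss, thr, cols)
    return best

def make_toy_data(n, p, seed):
    """Synthetic regression data: only columns 0 and 1 carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3 * r[0] - 2 * r[1] + rng.gauss(0, 0.1) for r in X]
    return X, y
```

Calling `loss_guided_stable_set` on training and validation sets from `make_toy_data` with, say, `k=4` and thresholds between 0.5 and 0.9 returns a stable set that should contain the two signal columns. Passing a different `loss` function changes which threshold wins, which is the sense in which the selection follows the chosen loss in this sketch.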
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Modeling and Causal Inference · Fault Detection and Control Systems
