selectBoost: a general algorithm to enhance the performance of variable selection methods in correlated datasets
Ismaïl Aouadi, Nicolas Jung, Raphael Carapito, Laurent Vallat, Seiamak Bahram, Myriam Maumy-Bertrand, Frédéric Bertrand

TL;DR
selectBoost is a general algorithm that improves the precision of variable selection in high-dimensional, correlated datasets. It relies on intensive simulations that account for the correlation structure of the data, and it can be used either to attach a confidence index to selected variables or to plan an experimental design.
Contribution
It introduces a general, simulation-based algorithm that boosts the precision of any variable selection method in correlated, high-dimensional data settings.
Findings
Improves variable selection precision in correlated datasets
Effective on both simulated and real data
Provides confidence indices for variable selection
Abstract
Motivation: With the growth of big data, variable selection has become one of the major challenges in statistics. Although many methods have been proposed in the literature, their performance in terms of recall and precision is limited when the number of variables far exceeds the number of observations or when the variables are highly correlated. Results: In this article, we propose a general algorithm that improves the precision of any existing variable selection method. This algorithm is based on highly intensive simulations and takes into account the correlation structure of the data. Our algorithm can either produce a confidence index for variable selection or be used from an experimental design planning perspective. We demonstrate the performance of our algorithm on both simulated and real data.
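The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: perturb correlated variables many times, rerun a base selector on each perturbed dataset, and report how often each variable is selected as a confidence index. Everything named here is hypothetical and is not the authors' implementation: `top_k_selector` is a stand-in for "any existing variable selection method", the threshold `c0` and the random-mixing perturbation in `perturb` are assumptions chosen for simplicity, and the columns are assumed to be on comparable scales.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists (0.0 if one is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def top_k_selector(columns, y, k=2):
    """Hypothetical base selector: keep the k columns most correlated with y."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: -scores[j])
    return set(ranked[:k])


def correlated_groups(columns, c0):
    """For each variable j, collect j plus every variable with |correlation| >= c0."""
    p = len(columns)
    return [
        [j] + [i for i in range(p)
               if i != j and abs(pearson(columns[j], columns[i])) >= c0]
        for j in range(p)
    ]


def perturb(columns, groups, rng):
    """Assumed perturbation: replace each variable that has correlated
    neighbours by a random, sign-aligned weighted mix of its group."""
    perturbed = []
    for j, group in enumerate(groups):
        if len(group) == 1:
            perturbed.append(columns[j])  # no correlated neighbour: keep as is
            continue
        weights = [rng.uniform(0.1, 1.0) for _ in group]
        signs = [1.0 if pearson(columns[j], columns[i]) >= 0 else -1.0
                 for i in group]
        total = sum(weights)
        perturbed.append([
            sum(w * s * columns[i][r] for w, s, i in zip(weights, signs, group)) / total
            for r in range(len(columns[j]))
        ])
    return perturbed


def confidence_index(columns, y, selector, c0=0.8, n_sim=100, seed=0):
    """Fraction of perturbed datasets in which each variable is selected."""
    rng = random.Random(seed)
    groups = correlated_groups(columns, c0)
    counts = [0] * len(columns)
    for _ in range(n_sim):
        for j in selector(perturb(columns, groups, rng), y):
            counts[j] += 1
    return [c / n_sim for c in counts]


# Toy data: x1 is a near-duplicate of x0; y depends on x0 and x2 only.
data_rng = random.Random(1)
n = 40
x0 = [data_rng.gauss(0, 1) for _ in range(n)]
x1 = [v + 0.1 * data_rng.gauss(0, 1) for v in x0]
x2 = [data_rng.gauss(0, 1) for _ in range(n)]
x3 = [data_rng.gauss(0, 1) for _ in range(n)]
y = [a + b + 0.1 * data_rng.gauss(0, 1) for a, b in zip(x0, x2)]
columns = [x0, x1, x2, x3]

conf = confidence_index(columns, y, top_k_selector, c0=0.8, n_sim=50)
# With c0 above 1 no variable has neighbours, so nothing is perturbed
# and the result reduces to the base selector's own choice.
baseline = confidence_index(columns, y, top_k_selector, c0=1.01, n_sim=50)
print(conf, baseline)
```

In this sketch, a variable whose selection frequency stays high as `c0` is lowered is one the base selector keeps choosing even when its correlated neighbours are mixed in, which is the sense in which the frequency acts as a confidence index here.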
