Optimizing model-agnostic Random Subspace ensembles
Vân Anh Huynh-Thu, Pierre Geurts

TL;DR
This paper introduces a model-agnostic ensemble method based on a parametric Random Subspace approach, optimizing feature selection probabilities via gradient descent for improved performance and interpretability.
Contribution
It proposes a differentiable feature sampling scheme for ensembles whose selection probabilities are tuned automatically, so the degree of randomization no longer has to be set by hand through a hyper-parameter.
Findings
Automatically tuned feature randomization improves ensemble performance.
Feature importance scores are derived from optimized feature selection probabilities.
The method can incorporate regularization terms that impose constraints on the feature importances.
Abstract
This paper presents a model-agnostic ensemble approach for supervised learning. The proposed approach is based on a parametric version of Random Subspace, in which each base model is learned from a feature subset sampled according to a Bernoulli distribution. Parameter optimization is performed using gradient descent and is rendered tractable by using an importance sampling approach that circumvents frequent re-training of the base models after each gradient descent step. The degree of randomization in our parametric Random Subspace is thus automatically tuned through the optimization of the feature selection probabilities. This is an advantage over the standard Random Subspace approach, where the degree of randomization is controlled by a hyper-parameter. Furthermore, the optimized feature selection probabilities can be interpreted as feature importance scores. Our algorithm can also…
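The two mechanisms named in the abstract, Bernoulli feature sampling and importance-sampling reuse of already trained base models, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the least-squares base learner, the self-normalized form of the importance weights, and all function names (`sample_masks`, `mask_probability`, `fit_base_models`, `reweighted_prediction`) are hypothetical choices made here, and the gradient descent step on the selection probabilities is left out.

```python
import numpy as np

def sample_masks(alpha, n_models, rng):
    """Draw one binary feature mask per base model.
    Feature j is kept with probability alpha[j] (independent Bernoulli draws)."""
    return (rng.random((n_models, alpha.size)) < alpha).astype(int)

def mask_probability(masks, alpha):
    """Probability of each mask under independent Bernoulli(alpha[j]) draws."""
    return np.prod(np.where(masks == 1, alpha, 1.0 - alpha), axis=1)

def fit_base_models(X, y, masks):
    """Fit one least-squares model per mask on the selected features only.
    (Assumption: any supervised learner could be used; least squares keeps the sketch short.)"""
    coefs = np.zeros(masks.shape, dtype=float)
    for t, mask in enumerate(masks):
        cols = np.flatnonzero(mask)
        if cols.size:  # an empty subset leaves the all-zero model
            coefs[t, cols] = np.linalg.lstsq(X[:, cols], y, rcond=None)[0]
    return coefs

def reweighted_prediction(X, coefs, masks, alpha_sampling, alpha_new):
    """Estimate the ensemble prediction under alpha_new while reusing models
    whose masks were drawn under alpha_sampling (self-normalized importance weights)."""
    w = mask_probability(masks, alpha_new) / mask_probability(masks, alpha_sampling)
    w = w / w.sum()
    return (X @ coefs.T) @ w

# Toy data: only the first two of five features carry signal (made up for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=200)

alpha = np.full(5, 0.5)                       # initial selection probabilities
masks = sample_masks(alpha, n_models=50, rng=rng)
coefs = fit_base_models(X, y, masks)          # trained once

alpha_new = np.array([0.9, 0.9, 0.1, 0.1, 0.1])
pred_old = reweighted_prediction(X, coefs, masks, alpha, alpha)
pred_new = reweighted_prediction(X, coefs, masks, alpha, alpha_new)  # no re-training
```

When `alpha_new` equals the sampling probabilities, every weight is equal and the estimate reduces to the plain ensemble average. Moving `alpha_new` away from them shifts weight toward the base models whose feature subsets are more likely under the new probabilities, which is what lets the probabilities be updated without fitting new models at each step.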
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference
