Feature Selection Methods for Cost-Constrained Classification in Random Forests
Rudolf Jagdhuber, Michel Lang, Jörg Rahnenführer

TL;DR
This paper introduces a fast, multivariate feature selection method called Shallow Tree Selection for cost-sensitive classification in Random Forests, addressing computational challenges and improving performance over baseline methods.
Contribution
The paper proposes a novel shallow tree-based feature selection method and adapts existing algorithms with a benefit-cost ratio (BCR) criterion for cost-sensitive learning in Random Forests.
Findings
BCR-based methods outperform baselines in simulations and real data.
No single method is best for all settings; multiple approaches are recommended.
BCR criterion improves feature selection effectiveness in cost-sensitive scenarios.
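To make the idea of a benefit-cost ratio concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each feature has some precomputed benefit score (for example an importance value) and a strictly positive cost, and that selection is limited by a total budget. The function name `bcr_greedy_select`, the greedy strategy, and the example feature names and numbers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def bcr_greedy_select(benefit: Dict[str, float],
                      cost: Dict[str, float],
                      budget: float) -> List[str]:
    """Greedily pick features by benefit/cost ratio within a cost budget.

    Illustrative assumption: `benefit` holds a per-feature score and
    `cost` holds a strictly positive per-feature cost.
    """
    # Rank features by their benefit-cost ratio, highest ratio first.
    ranked = sorted(benefit, key=lambda f: benefit[f] / cost[f], reverse=True)

    selected: List[str] = []
    spent = 0.0
    for feature in ranked:
        # Add the feature only if it still fits into the remaining budget.
        if spent + cost[feature] <= budget:
            selected.append(feature)
            spent += cost[feature]
    return selected


# Hypothetical example values (not taken from the paper).
benefit = {"age": 0.10, "blood_test": 0.30, "mri": 0.45}
cost = {"age": 1.0, "blood_test": 5.0, "mri": 20.0}

chosen = bcr_greedy_select(benefit, cost, budget=10.0)
print(chosen)
```

In this toy setting the most informative feature ("mri") is skipped because its cost exceeds the budget, while two cheaper features with higher ratios are kept. A plain ranking by benefit alone would not account for this trade-off.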
Abstract
Cost-sensitive feature selection describes a feature selection problem in which features incur individual costs for inclusion in a model. These costs make it possible to incorporate undesirable aspects of features, e.g. failure rates of a measuring device or patient harm, into the model selection process. Random Forests pose a particularly challenging problem for feature selection, as features are generally entangled in an ensemble of multiple trees, which makes a post hoc removal of features infeasible. Feature selection methods therefore often either focus on simple pre-filtering methods or require many Random Forest evaluations along their optimization path, which drastically increases the computational complexity. To solve both issues, we propose Shallow Tree Selection, a novel fast and multivariate feature selection method that selects features from small tree structures. Additionally, we…
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Face and Expression Recognition
Methods: Feature Selection
