Semi-supervised Wrapper Feature Selection by Modeling Imperfect Labels
Vasilii Feofanov, Emilie Devijver, Massih-Reza Amini

TL;DR
This paper introduces a semi-supervised wrapper feature selection method that models label imperfections using a probabilistic error model, combining genetic algorithms and a novel multi-class bound to improve feature selection accuracy.
Contribution
It presents a new wrapper feature selection approach that explicitly accounts for the mislabeling errors introduced by pseudo-labeling, using a probabilistic error model, and pairs this criterion with a genetic algorithm that proposes feature subsets.
Findings
Outperforms several state-of-the-art semi-supervised feature selection methods.
Effectively models label noise to improve feature subset quality.
Shows effectiveness in empirical results on several data sets.
Abstract
In this paper, we propose a new wrapper feature selection approach with partially labeled training examples where unlabeled observations are pseudo-labeled using the predictions of an initial classifier trained on the labeled training set. The wrapper is composed of a genetic algorithm for proposing new feature subsets, and an evaluation measure for scoring the different feature subsets. The selection of feature subsets is done by assigning weights to characteristics and recursively eliminating those that are irrelevant. The selection criterion is based on a new multi-class C-bound that explicitly takes into account the mislabeling errors induced by the pseudo-labeling mechanism, using a probabilistic error model. Empirical results on different data sets show the effectiveness of our framework compared to several state-of-the-art semi-supervised feature selection approaches.
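The pipeline described in the abstract (pseudo-label the unlabeled points with an initial classifier, then let a genetic algorithm propose feature subsets that an evaluation measure scores) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid classifier, the toy data, and all function names (`centroids`, `predict`, `fitness`, `genetic_search`) are hypothetical, and the score used here is plain agreement with labels and pseudo-labels, swapped in for the paper's C-bound criterion, which is not reproduced.

```python
import random

def centroids(X, y, mask):
    """Class means computed over the features kept by the 0/1 mask."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append([v for v, m in zip(row, mask) if m])
    return {c: [sum(col) / len(rows) for col in zip(*rows)] for c, rows in groups.items()}

def predict(cents, row, mask):
    """Nearest-centroid prediction restricted to the masked features."""
    sel = [v for v, m in zip(row, mask) if m]
    return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(sel, cents[c])))

def fitness(mask, X_lab, y_lab, X_unl, y_pseudo):
    """Stand-in score (assumption): accuracy on labeled plus pseudo-labeled points."""
    if not any(mask):
        return 0.0  # an empty feature subset cannot be evaluated
    cents = centroids(X_lab, y_lab, mask)
    X_all, y_all = X_lab + X_unl, y_lab + y_pseudo
    hits = sum(predict(cents, r, mask) == t for r, t in zip(X_all, y_all))
    return hits / len(X_all)

def genetic_search(score, n_features, rng, pop_size=12, generations=15, p_mut=0.1):
    """Tiny GA over bit masks: elitism, tournaments, one-point crossover, bit flips."""
    pop = [[1] * n_features]  # seed the population with the full feature set
    pop += [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        nxt = [max(pop, key=score)]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=score)  # tournament of size 2
            b = max(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            nxt.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = nxt
    return max(pop, key=score)

rng = random.Random(0)

def sample(label):
    # Toy data: features 0 and 1 carry the class signal, features 2-4 are noise.
    signal = [3 * label + rng.gauss(0, 1), -3 * label + rng.gauss(0, 1)]
    return signal + [rng.gauss(0, 3) for _ in range(3)]

y_lab = [0, 1] * 5
X_lab = [sample(t) for t in y_lab]
X_unl = [sample(rng.randint(0, 1)) for _ in range(40)]

# Step 1: pseudo-label the unlabeled points with an initial classifier
# trained on the labeled set using all features.
full = [1] * 5
initial = centroids(X_lab, y_lab, full)
y_pseudo = [predict(initial, r, full) for r in X_unl]

# Step 2: the GA proposes subsets, the stand-in score evaluates them.
score = lambda m: fitness(m, X_lab, y_lab, X_unl, y_pseudo)
best = genetic_search(score, 5, rng)
print("selected mask:", best, "score:", round(score(best), 3))
```

Note that this naive score treats pseudo-labels as if they were true labels, so it favors subsets that reproduce the initial classifier's decisions. That bias is the kind of problem the abstract's criterion is meant to address by explicitly modeling the mislabeling errors of the pseudo-labeling step.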
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Anomaly Detection Techniques and Applications
Methods: Feature Selection
