Review and Evaluation of Feature Selection Algorithms in Synthetic Problems
L.A. Belanche, F.F. González

TL;DR
This paper systematically evaluates various feature selection algorithms on synthetic datasets, analyzing their accuracy and efficiency in relation to data relevance, redundancy, and size, under controlled experimental conditions.
Contribution
It introduces a new evaluation measure for feature selection algorithms and provides a comprehensive experimental comparison on synthetic problems.
Findings
Algorithms vary in accuracy depending on data relevance and redundancy.
The evaluation measure scores how closely an algorithm's output matches the known optimal solutions.
Results inform best practices for feature selection in synthetic scenarios.
Abstract
The main purpose of Feature Subset Selection is to find a reduced subset of attributes from a data set described by a feature set. The task of a feature selection algorithm (FSA) is to provide a computational solution motivated by a certain definition of relevance or by a reliable evaluation measure. In this paper, several fundamental algorithms are studied to assess their performance in a controlled experimental scenario. A measure to evaluate FSAs is devised that computes the degree of matching between the output given by an FSA and the known optimal solutions. An extensive experimental study on synthetic problems is carried out to assess the behaviour of the algorithms in terms of solution accuracy and size as a function of the relevance, irrelevance, redundancy and size of the data samples. The controlled experimental conditions facilitate the derivation of better-supported and…
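To make the idea of scoring an FSA against known optimal solutions concrete, here is a minimal Python sketch. It is not the measure defined in the paper: it uses plain Jaccard overlap as a stand-in. It also assumes that redundancy can make several subsets equally optimal, so the score is taken against the closest one. The function names and the toy feature indices are hypothetical.

```python
def matching_score(selected, optimal):
    """Jaccard overlap between a selected feature subset and one optimal subset.

    Illustrative stand-in only; not the measure defined in the paper.
    """
    selected, optimal = set(selected), set(optimal)
    if not selected and not optimal:
        return 1.0  # two empty subsets match trivially
    return len(selected & optimal) / len(selected | optimal)


def best_matching_score(selected, optimal_solutions):
    """Score against the closest of several equally valid optimal subsets."""
    return max(matching_score(selected, opt) for opt in optimal_solutions)


# Hypothetical synthetic problem: features 0 and 1 are relevant,
# feature 2 is a redundant copy of feature 1, features 3-5 are irrelevant.
# Either {0, 1} or {0, 2} is therefore treated as an optimal solution.
optimal_solutions = [{0, 1}, {0, 2}]

exact = best_matching_score({0, 2}, optimal_solutions)               # picks a valid optimum
with_irrelevant = best_matching_score({0, 1, 4}, optimal_solutions)  # adds one irrelevant feature
missed = best_matching_score({3, 5}, optimal_solutions)              # only irrelevant features

print(exact, with_irrelevant, missed)
```

In this toy setup, selecting an irrelevant feature alongside the relevant ones lowers the score, and choosing either member of a redundant pair is not penalised. A controlled synthetic problem makes this possible because the optimal subsets are known in advance.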
Taxonomy
Topics: Evolutionary Algorithms and Applications · Fuzzy Logic and Control Systems · Machine Learning and Data Classification
