Distance-based mutual congestion feature selection with genetic algorithm for high-dimensional medical datasets
Hossein Nematzadeh, Joseph Mani, Zahra Nematzadeh, Ebrahim Akbari, and Radziah Mohamad

TL;DR
This paper presents a hybrid feature selection method combining Distance-based Mutual Congestion and a genetic algorithm with adaptive rates, specifically designed for high-dimensional medical datasets, improving prediction accuracy.
Contribution
It introduces a novel filter method, DMC, that considers both the feature values and the distribution of observations in the response variable, and combines it with an adaptive genetic algorithm for effective feature subset selection.
Findings
Outperforms recent feature selection methods on medical datasets
Effectively reduces multicollinearity among selected features
Improves prediction accuracy in binary classification tasks
Abstract
Feature selection poses a challenge in small-sample high-dimensional datasets, where the number of features exceeds the number of observations, as seen in microarray, gene expression, and medical datasets. There isn't a universally optimal feature selection method applicable to any data distribution, and as a result, the literature consistently endeavors to address this issue. One recent approach in feature selection is termed frequency-based feature selection. However, existing methods in this domain tend to overlook feature values, focusing solely on the distribution in the response variable. In response, this paper introduces the Distance-based Mutual Congestion (DMC) as a filter method that considers both the feature values and the distribution of observations in the response variable. DMC sorts the features of datasets, and the top 5% are retained and clustered by KMeans to…
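The filter step described in the abstract (score every feature, sort, retain the top 5%) can be illustrated with a minimal sketch. The DMC score itself is not defined in the text above, so `standin_score` below is a hypothetical placeholder (class-mean gap divided by the feature's standard deviation), not the authors' measure; `top_fraction` and the synthetic data are likewise assumptions made for illustration. The KMeans clustering step is left out because the abstract is cut off before describing it.

```python
import math
import random


def standin_score(values, labels):
    """Hypothetical placeholder score, NOT the paper's DMC.

    Returns the absolute gap between the two class means,
    divided by the feature's overall standard deviation.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    mean_all = sum(values) / len(values)
    std = math.sqrt(sum((v - mean_all) ** 2 for v in values) / len(values))
    if not pos or not neg or std == 0.0:
        return 0.0
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg)) / std


def top_fraction(X, y, fraction=0.05, score=standin_score):
    """Rank feature columns by score (descending); return the top fraction's indices."""
    n_features = len(X[0])
    scores = [score([row[j] for row in X], y) for j in range(n_features)]
    keep = max(1, math.ceil(fraction * n_features))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]


# Synthetic small-sample, high-dimensional data: 20 observations, 200 features.
rng = random.Random(0)
y = [i % 2 for i in range(20)]
X = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(20)]
for row, label in zip(X, y):
    row[7] += 5.0 * label  # make feature 7 informative about the binary response

selected = top_fraction(X, y)
print(len(selected), 7 in selected)
```

Any scoring function with the same `(values, labels)` signature can be passed as `score`, so a faithful DMC implementation taken from the paper could replace the placeholder without changing the ranking logic.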
Taxonomy
Topics: Educational Technology and Assessment · Advanced Computing and Algorithms
Methods: Feature Selection
