Improving statistical learning methods via features selection without replacement sampling and random projection
Sulaiman Khan, Muhammad Ahmad, Fida Ullah, Carlos Aguilar Ibañez, José Eduardo Valdez Rodriguez

TL;DR
This paper introduces a novel machine learning approach combining feature selection without replacement sampling and random projection to enhance classification accuracy in high-dimensional microarray cancer datasets, effectively reducing overfitting.
Contribution
It presents an integrated method that combines Feature Selection Without Replacement (FSWOR) with a projection technique and statistical gene selection to improve microarray data classification accuracy.
Findings
Achieved 96% classification accuracy, outperforming existing methods by 9.09%.
Reduced the feature space from 54,675 to 20,890 genes using the Kendall test.
Demonstrated effectiveness in high-dimensional gene expression analysis.
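The Kendall-based gene filtering mentioned above can be illustrated with a minimal sketch. It assumes the filter scores each gene by Kendall's tau-b against the class label and keeps genes whose absolute tau clears a cutoff. The function names, the toy data, and the `min_abs_tau` cutoff are hypothetical; the summary does not state the paper's actual significance criterion, which is more likely a p-value threshold.

```python
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences (O(n^2) pairwise form)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted in neither tie term
        if dx == 0:
            ties_x += 1  # tied only in x
        elif dy == 0:
            ties_y += 1  # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denom = ((pairs + ties_x) * (pairs + ties_y)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0


def select_features(X, y, min_abs_tau=0.5):
    """Return indices of columns whose |tau-b| with y is at least min_abs_tau.

    min_abs_tau is an illustrative stand-in for a significance cutoff.
    """
    keep = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        if abs(kendall_tau_b(column, y)) >= min_abs_tau:
            keep.append(j)
    return keep


# Toy expression matrix: 6 samples x 3 genes, binary class labels.
X = [
    [1.0, 5.0, 0.3],
    [2.0, 4.0, 0.3],
    [3.0, 6.0, 0.3],
    [4.0, 1.0, 0.3],
    [5.0, 2.0, 0.3],
    [6.0, 3.0, 0.3],
]
y = [0, 0, 0, 1, 1, 1]
selected = select_features(X, y)
```

In this toy example the constant third gene carries no rank information and is dropped, while the two genes that order the classes consistently are kept. On a dataset with tens of thousands of genes, a library routine with an O(n log n) implementation would be the practical choice.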
Abstract
Cancer is fundamentally a genetic disease characterized by genetic and epigenetic alterations that disrupt normal gene expression, leading to uncontrolled cell growth and metastasis. High-dimensional microarray datasets pose challenges for classification models due to the "small n, large p" problem, resulting in overfitting. This study makes three key contributions: 1) we propose a machine learning-based approach integrating the Feature Selection Without Replacement (FSWOR) technique and a projection method to improve classification accuracy; 2) we apply the Kendall statistical test to identify the most significant genes from the brain cancer microarray dataset (GSE50161), reducing the feature space from 54,675 to 20,890 genes; 3) we apply machine learning models using k-fold cross-validation techniques in which our model incorporates ensemble classifiers with LDA projection…
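The abstract does not spell out how FSWOR works, so the following is only a sketch under one plausible reading: feature indices are drawn without replacement into disjoint subsets, one base learner is trained per subset, and predictions are combined by majority vote. A nearest-centroid classifier is swapped in here as the base learner to keep the example dependency-free; it is not the LDA projection or the ensemble classifiers the paper names. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def sample_feature_subsets(n_features, subset_size, seed=0):
    """Split feature indices into disjoint subsets by drawing without replacement."""
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)  # a random permutation is a draw without replacement
    return [indices[i:i + subset_size] for i in range(0, n_features, subset_size)]


def fit_centroids(X, y, features):
    """Per-class mean vector restricted to the given feature subset."""
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, j in enumerate(features):
            acc[k] += row[j]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_one(row, features, centroids):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def dist(centroid):
        return sum((row[j] - centroid[k]) ** 2 for k, j in enumerate(features))
    return min(centroids, key=lambda label: dist(centroids[label]))


def ensemble_predict(X_train, y_train, row, subset_size, seed=0):
    """Majority vote over one base learner per disjoint feature subset."""
    votes = []
    for features in sample_feature_subsets(len(X_train[0]), subset_size, seed):
        centroids = fit_centroids(X_train, y_train, features)
        votes.append(predict_one(row, features, centroids))
    return Counter(votes).most_common(1)[0][0]


# Toy data: 4 samples x 4 features, two classes.
X_train = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.2, 0.0, 0.1],
    [1.0, 1.0, 1.0, 1.0],
    [0.9, 1.1, 1.0, 0.8],
]
y_train = [0, 0, 1, 1]
subsets = sample_feature_subsets(4, 2)
label_a = ensemble_predict(X_train, y_train, [0.05, 0.1, 0.0, 0.0], subset_size=2)
label_b = ensemble_predict(X_train, y_train, [1.0, 1.0, 1.0, 1.0], subset_size=2)
```

Because the subsets are disjoint, no feature is used by more than one base learner, which is the property that "without replacement" suggests. In a k-fold cross-validation setting, the subset sampling and any projection would be fitted on the training folds only, so that the held-out fold does not leak into feature selection.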
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications
Methods: Feature Selection · Linear Discriminant Analysis
