A Study of Feature Selection and Extraction Algorithms for Cancer Subtype Prediction
Vaibhav Sinha, Siladitya Dash, Nazma Naskar, and Sk Md Mosaddek Hossain

TL;DR
This paper evaluates feature selection algorithms for cancer subtype prediction, demonstrating that sequential application reduces computational costs and can enhance model performance in high-dimensional omics data.
Contribution
It introduces a sequential approach to feature selection that improves efficiency and predictive accuracy for cancer subtype classification.
Findings
Sequential feature selection reduces computational cost.
Dimension reduction can improve model performance in some cases.
Data analysis and visualization support these findings.
Abstract
In this work, we study and analyze different feature selection algorithms that can be used to classify cancer subtypes from highly variable, high-dimensional data. We apply three different feature selection methods to five different types of cancer, each with two separate omics. We show that the existing feature selection methods are computationally expensive when applied individually. Instead, we apply these algorithms sequentially, which helps lower the computational cost and improve the predictive performance. We further show that reducing the number of features using dimension reduction techniques can improve the performance of machine learning models in some cases. We support our findings through comprehensive data analysis and visualization.
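The sequential idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the three feature selection methods, so everything below is an assumption made for illustration: the choice of scikit-learn, the three selectors (a variance filter, an ANOVA F-test filter, and recursive feature elimination), the feature counts, and the synthetic data standing in for an omics matrix. The point is only that cheap filters run first, so the expensive wrapper method sees far fewer features.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, VarianceThreshold, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Synthetic stand-in for an omics matrix: few samples, many features.
# (Hypothetical data; not the datasets used in the paper.)
X, y = make_classification(
    n_samples=120,
    n_features=2000,
    n_informative=20,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# Selectors are chained from cheapest to most expensive, so each stage
# only processes the features that survived the previous one.
selector = Pipeline([
    # Stage 1: drop constant features (very cheap, unsupervised).
    ("variance", VarianceThreshold(threshold=0.0)),
    # Stage 2: keep the 200 features with the highest ANOVA F-scores.
    ("anova", SelectKBest(f_classif, k=200)),
    # Stage 3: recursive feature elimination on the 200 survivors,
    # removing 10% of those features per iteration until 20 remain.
    ("rfe", RFE(LogisticRegression(max_iter=1000),
                n_features_to_select=20, step=0.1)),
])

X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)
```

In this sketch the recursive elimination stage fits its model on 200 columns rather than 2000, which is where the cost saving would come from. Whether the reduced feature set also improves predictive performance depends on the data and should be checked with cross-validation.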
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks
Methods: Feature Selection
