Precision Cancer Classification and Biomarker Identification from mRNA Gene Expression via Dimensionality Reduction and Explainable AI
Farzana Tabassum, Sabrina Islam, Siana Rizwan, Masrur Sobhan, Tasnim Ahmed, Sabbir Ahmed, and Tareque Mohmud Chowdhury

TL;DR
This paper introduces a pipeline that combines dimensionality reduction and explainable AI to accurately classify 33 cancer types from mRNA gene expression data and identify relevant biomarkers.
Contribution
It presents a novel approach that reduces high-dimensional gene expression data to 500 features while maintaining high classification accuracy and providing biological insights.
Findings
Achieved 96.61% classification accuracy.
Identified cancer-specific genes using only 500 features.
Demonstrated biological relevance through DGE analysis.
Abstract
Gene expression analysis is a critical method for cancer classification, enabling precise diagnoses through the identification of unique molecular signatures associated with various tumors. Identifying cancer-specific genes from gene expression values enables a more tailored and personalized treatment approach. However, the high dimensionality of mRNA gene expression data poses challenges for analysis and data extraction. This research presents a comprehensive pipeline designed to accurately identify 33 distinct cancer types and their corresponding gene sets. It incorporates a combination of normalization and feature selection techniques to reduce dataset dimensionality effectively while ensuring high performance. Notably, our pipeline successfully identifies a substantial number of cancer-specific genes using a reduced feature set of just 500, in contrast to using the full dataset…
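As a rough illustration of the idea in the abstract (normalize, reduce to 500 features, then classify), here is a minimal sketch using scikit-learn. The abstract does not name the specific normalization, feature selection, or classification methods, so `StandardScaler`, `SelectKBest` with an ANOVA F-test, and `LogisticRegression` are stand-ins chosen for illustration, not the authors' pipeline. The data is synthetic, and the helper names `build_pipeline` and `selected_feature_indices` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

N_SELECTED = 500  # size of the reduced feature set reported in the summary


def build_pipeline(k=N_SELECTED):
    """Normalize -> keep the top-k features -> classify.

    All three steps are assumed stand-ins; the source does not specify them.
    """
    return make_pipeline(
        StandardScaler(),                        # assumed normalization step
        SelectKBest(score_func=f_classif, k=k),  # assumed feature selector
        LogisticRegression(max_iter=1000),       # assumed classifier
    )


def selected_feature_indices(pipeline):
    """Return the column indices (gene positions) kept by the selector."""
    selector = pipeline.named_steps["selectkbest"]
    return selector.get_support(indices=True)


# Synthetic stand-in for an expression matrix: rows are samples, columns are genes.
X, y = make_classification(
    n_samples=300,
    n_features=2000,
    n_informative=40,
    n_classes=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_pipeline().fit(X_train, y_train)
kept = selected_feature_indices(model)
accuracy = model.score(X_test, y_test)

print(f"features kept: {len(kept)}")
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Placing the selector inside the pipeline means it is fit on the training split only, which avoids leaking test labels into the feature ranking. The indices returned by `selected_feature_indices` could then be mapped back to gene identifiers for downstream interpretation.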
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Machine Learning and Data Classification · AI in cancer detection
Methods: Sparse Evolutionary Training · Feature Selection
