Hybrid gene selection approach using XGBoost and multi-objective genetic algorithm for cancer classification
Xiongshi Deng, Min Li, Shaobo Deng, Lei Wang

TL;DR
This paper introduces a two-stage gene selection method combining XGBoost and a multi-objective genetic algorithm to improve cancer classification accuracy using microarray data.
Contribution
It presents a novel hybrid approach that effectively reduces irrelevant genes and optimizes relevant gene subsets for better classification performance.
Findings
XGBoost-MOGA outperforms existing feature selection methods.
The approach improves accuracy, F-score, precision, and recall.
It is validated on 13 microarray datasets.
Abstract
Microarray gene expression data are often accompanied by a large number of genes and a small number of samples. However, only a few of these genes are relevant to cancer, resulting in significant gene selection challenges. Hence, we propose a two-stage gene selection approach by combining extreme gradient boosting (XGBoost) and a multi-objective optimization genetic algorithm (XGBoost-MOGA) for cancer classification in microarray datasets. In the first stage, the genes are ranked using an ensemble-based feature selection method built on XGBoost. This stage can effectively remove irrelevant genes and yield a group comprising the genes most relevant to the class. In the second stage, XGBoost-MOGA searches for an optimal gene subset within this group of most relevant genes using a multi-objective optimization genetic algorithm. We performed comprehensive experiments to compare XGBoost-MOGA…
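The two-stage idea in the abstract can be illustrated with a minimal sketch, using only the Python standard library. Everything specific here is an assumption made for illustration and is not taken from the paper: the hard-coded `importances` list stands in for XGBoost feature-importance scores, the two minimized objectives (informative genes missed, subset size) are a toy proxy for classification error and gene count, and the simple Pareto-based loop in `moga` is a generic multi-objective genetic algorithm, not the authors' implementation.

```python
import random

def top_k_genes(importances, k):
    """Stage 1 stand-in: indices of the k highest-scoring genes."""
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return ranked[:k]

def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(masks, objectives):
    """Unique masks whose objective vectors no other mask dominates."""
    scored = [(m, objectives(m)) for m in dict.fromkeys(masks)]
    return [m for m, f in scored
            if not any(dominates(g, f) for _, g in scored)]

def moga(n_genes, objectives, pop_size=20, generations=30, seed=0):
    """Stage 2 sketch: evolve 0/1 gene masks (n_genes >= 2) and
    return the non-dominated masks of the final population."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_genes))
           for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            p, q = rng.choice(pop), rng.choice(pop)
            cut = rng.randrange(1, n_genes)      # one-point crossover
            child = list(p[:cut] + q[cut:])
            j = rng.randrange(n_genes)           # single bit-flip mutation
            child[j] = 1 - child[j]
            children.append(tuple(child))
        merged = pop + children
        front = pareto_front(merged, objectives)
        rest = [m for m in merged if m not in front]
        rng.shuffle(rest)
        pop = (front + rest)[:pop_size]          # keep the front first
    return pareto_front(pop, objectives)

# Hypothetical per-gene scores; in practice these would come from a
# trained gradient-boosting model rather than a hard-coded list.
importances = [0.30, 0.01, 0.02, 0.25, 0.00, 0.20, 0.01, 0.15]
candidates = top_k_genes(importances, 5)

# Toy ground truth, assumed only so the objectives can be computed.
informative = {0, 3}

def objectives(mask):
    chosen = {candidates[i] for i, bit in enumerate(mask) if bit}
    return (len(informative - chosen), len(chosen))

front = moga(len(candidates), objectives)
for mask in front:
    print(mask, objectives(mask))
```

In a real pipeline the toy `objectives` function would be replaced by a cross-validated classifier error on the selected genes, paired with the subset size. The returned front is a set of trade-offs between the two objectives rather than a single best subset.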
Taxonomy
Topics: Gene expression and cancer classification · Machine Learning and ELM · Machine Learning in Bioinformatics
Methods: Feature Selection
