A Hybrid Ensemble Feature Selection Design for Candidate Biomarkers Discovery from Transcriptome Profiles
Felipe Colombelli, Thayne Woycinck Kowalski, Mariana Recamonde-Mendoza

TL;DR
This paper introduces a Hybrid Ensemble Feature Selection (Hyb-EFS) method that combines data-level and method-level perturbations to improve the stability and biological relevance of candidate biomarkers from transcriptome data. It is reported to be more stable than homogeneous (Hom-EFS) and heterogeneous (Het-EFS) ensembles.
Contribution
The paper proposes a novel Hybrid EFS design that aims to improve the stability and reproducibility of biomarker discovery by integrating perturbations at both the data and method levels. It is validated on microarray datasets covering four types of cancer.
Findings
Hyb-EFS reduces performance variability across datasets.
Hyb-EFS improves stability over Het-EFS and Hom-EFS.
Hyb-EFS yields biologically plausible biomarkers with high cancer gene enrichment.
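Stability in these findings refers to how consistently the same genes are selected when the data or the selection method changes. The sketch below shows one common way to quantify it, assuming the average pairwise Jaccard similarity between selected gene subsets. This summary does not say which stability measure the authors used, and the function names and gene labels are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of selected features."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def selection_stability(subsets):
    """Average pairwise Jaccard similarity over all pairs of subsets.

    Assumption: this is an illustrative measure, not necessarily the
    one reported in the paper.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical gene labels selected in three resampling runs.
runs = [["g1", "g2", "g3"], ["g1", "g2", "g4"], ["g1", "g2", "g3"]]
stability = selection_stability(runs)
```

A value near 1 means the runs select almost the same genes, while a value near 0 means the selections barely overlap.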
Abstract
The discovery of disease biomarkers from gene expression data has been greatly advanced by feature selection (FS) methods, especially using ensemble FS (EFS) strategies with perturbation at the data level (i.e., homogeneous, Hom-EFS) or method level (i.e., heterogeneous, Het-EFS). Here we propose a Hybrid EFS (Hyb-EFS) design that explores both types of perturbation to improve the stability and the predictive power of candidate biomarkers. With this, Hyb-EFS aims to disrupt associations of good performance with a single dataset, single algorithm, or a specific combination of both, which is particularly interesting for better reproducibility of genomic biomarkers. We investigated the adequacy of our approach for microarray data related to four types of cancer, carrying out an extensive comparison with other ensemble and single FS approaches. Five FS methods were used in our experiments:…
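The hybrid design described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It makes three assumptions: bootstrap resampling supplies the data-level perturbation, a set of interchangeable rankers supplies the method-level perturbation, and mean rank is the aggregation rule at both levels. The abstract is truncated before it names the five FS methods, so the two rankers here are hypothetical placeholders, as are all function and parameter names.

```python
import numpy as np


def rank_by_mean_difference(X, y):
    """Hypothetical ranker: absolute difference of class means."""
    return np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))


def rank_by_t_like_score(X, y):
    """Hypothetical ranker: magnitude of a Welch-style t statistic."""
    a, b = X[y == 1], X[y == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)


def scores_to_ranks(scores):
    """Convert scores to ranks; rank 0 is the highest-scoring feature."""
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(scores))
    return ranks


def stratified_bootstrap(y, rng):
    """Resample indices with replacement within each class."""
    parts = [rng.choice(np.flatnonzero(y == c), size=int(np.sum(y == c)), replace=True)
             for c in np.unique(y)]
    return np.concatenate(parts)


def hybrid_efs(X, y, rankers, n_bootstraps=20, seed=0):
    """Return feature indices ordered from most to least relevant.

    Inner loop: data perturbation (bootstraps), aggregated per ranker.
    Outer loop: method perturbation (rankers), aggregated across rankers.
    Aggregation by mean rank is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    per_method = []
    for ranker in rankers:
        boot_ranks = []
        for _ in range(n_bootstraps):
            idx = stratified_bootstrap(y, rng)
            boot_ranks.append(scores_to_ranks(ranker(X[idx], y[idx])))
        per_method.append(np.mean(boot_ranks, axis=0))
    final = np.mean(per_method, axis=0)
    return np.argsort(final, kind="stable")
```

In this sketch, a homogeneous ensemble corresponds to passing a single ranker with several bootstraps, and a heterogeneous ensemble corresponds to running several rankers on the unresampled data. The hybrid variant nests the two.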
Taxonomy
Topics: Gene expression and cancer classification · Cancer-related molecular mechanisms research · Machine Learning in Bioinformatics
Methods: Feature Selection
