Feature Selection for Microarray Gene Expression Data using Simulated Annealing guided by the Multivariate Joint Entropy
Fernando González, Lluís A. Belanche

TL;DR
This paper introduces a novel information-theoretic method for feature selection in microarray gene expression data, utilizing a new multivariate joint entropy calculation and a simulated annealing algorithm to identify relevant gene subsets efficiently.
Contribution
It presents a new low-complexity multivariate joint entropy measure and a specialized simulated annealing algorithm for effective gene subset selection in microarray data.
Findings
High classification performance achieved
Selected gene subsets are biologically meaningful
Algorithm demonstrates low computational complexity
Abstract
In this work, a new way to calculate the multivariate joint entropy is presented. This measure is the basis for a fast information-theoretic evaluation of gene relevance in a Microarray Gene Expression data context. Its low complexity comes from reusing previous computations to calculate the relevance of the current feature. The mu-TAFS algorithm (so named to differentiate it from previous TAFS algorithms) implements a simulated annealing technique specially designed for feature subset selection. The algorithm is applied to the maximization of gene subset relevance in several public-domain microarray data sets. The experimental results show notably high classification performance and small subsets formed by biologically meaningful genes.
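To make the idea in the abstract concrete, here is a minimal Python sketch of simulated annealing over feature subsets scored by an entropy-based relevance. It is an illustration under stated assumptions, not the mu-TAFS algorithm itself: relevance is assumed to be the mutual information between the selected (already discretized) genes and the class label, the per-feature size penalty, the geometric cooling schedule, and all function and parameter names are hypothetical, and the joint entropy is recomputed from scratch rather than reusing earlier computations as the paper describes.

```python
import math
import random
from collections import Counter


def joint_entropy(rows, subset):
    """Empirical joint entropy (in bits) of the discretized columns in `subset`."""
    if not subset:
        return 0.0
    cols = sorted(subset)
    counts = Counter(tuple(row[j] for j in cols) for row in rows)
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def relevance(rows, labels, subset):
    """Assumed relevance: I(X_S; Y) = H(X_S) + H(Y) - H(X_S, Y)."""
    if not subset:
        return 0.0
    # Append the class label as an extra column so it can be treated like a feature.
    extended = [tuple(row) + (y,) for row, y in zip(rows, labels)]
    y_col = len(rows[0])
    h_y = joint_entropy(extended, {y_col})
    h_xy = joint_entropy(extended, set(subset) | {y_col})
    return joint_entropy(rows, subset) + h_y - h_xy


def anneal(rows, labels, t0=1.0, cooling=0.95, steps=500, penalty=0.01, seed=0):
    """Simulated annealing over feature subsets (hypothetical parameters)."""
    rng = random.Random(seed)
    n_features = len(rows[0])

    def score(subset):
        # The size penalty is an assumption of this sketch; it favors small subsets.
        return relevance(rows, labels, subset) - penalty * len(subset)

    current = set()
    cur_score = score(current)
    best, best_score = set(current), cur_score
    t = t0
    for _ in range(steps):
        # Neighbor move: toggle one randomly chosen feature in or out of the subset.
        candidate = current ^ {rng.randrange(n_features)}
        cand_score = score(candidate)
        delta = cand_score - cur_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / max(t, 1e-12)):
            current, cur_score = candidate, cand_score
            if cur_score > best_score:
                best, best_score = set(current), cur_score
        t *= cooling  # geometric cooling
    return best, best_score
```

Calling `anneal(rows, labels)` on a list of equal-length tuples of discretized expression levels returns the best subset visited (as a set of column indices) together with its penalized score. With few samples and many genes, empirical mutual information saturates quickly, which is one reason this sketch penalizes subset size.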
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Evolutionary Algorithms and Applications
