Algorithm for Finding Optimal Gene Sets in Microarray Prediction

J.M. Deutsch

arXiv:physics/0108011·physics.bio-ph·May 23, 2007·3 cites

TL;DR

This paper introduces a replication algorithm that identifies minimal gene sets for accurate cancer classification using microarray data, reducing the number of genes needed while maintaining perfect classification accuracy.

Contribution

The paper presents a replication algorithm that evolves an ensemble of predictors, each using a different combination of genes, to find optimal gene sets for cancer diagnosis. On the childhood cancer data it reduces the number of genes needed from 96 to 15.

Findings

01. Reduced gene set from 96 to 15 for childhood cancers

02. Achieved perfect classification on test data

03. Validated method on leukemia and childhood cancer datasets

Abstract

Motivation: Microarray data has recently been shown to be efficacious in distinguishing closely related cell types that often appear in the diagnosis of cancer. It is useful to determine the minimum number of genes needed for such a diagnosis, both for clinical use and to determine the importance of specific genes for cancer. Here a replication algorithm is used for this purpose. It evolves an ensemble of predictors, each using a different combination of genes, to generate a set of optimal predictors. Results: We apply this method to the leukemia data of the Whitehead/MIT group, which attempts to differentially diagnose two kinds of leukemia, and also to the data of Khan et al. to distinguish four different kinds of childhood cancers. In the latter case we were able to reduce the number of genes needed from 96 down to 15, while at the same time being able to perfectly classify all of…
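The idea of evolving an ensemble of predictors over different gene combinations can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not specify the predictor, the mutation moves, or the selection rule, so the leave-one-out nearest-neighbor scoring, the add-or-drop-one-gene mutation, the keep-the-best-half selection, and all function names below (`nearest_neighbor_errors`, `mutate`, `evolve`) are assumptions chosen for illustration.

```python
import random


def nearest_neighbor_errors(samples, labels, genes):
    """Leave-one-out 1-nearest-neighbor error count using only `genes`.

    Assumed scoring rule for this sketch; the paper's predictor may differ.
    """
    errors = 0
    for i, x in enumerate(samples):
        best_dist, best_label = float("inf"), None
        for j, y in enumerate(samples):
            if i == j:
                continue
            dist = sum((x[g] - y[g]) ** 2 for g in genes)
            if dist < best_dist:
                best_dist, best_label = dist, labels[j]
        errors += best_label != labels[i]
    return errors


def mutate(genes, n_genes, rng):
    """Randomly drop or add one gene; never returns an empty set."""
    genes = set(genes)
    if len(genes) > 1 and rng.random() < 0.5:
        genes.remove(rng.choice(sorted(genes)))
    else:
        genes.add(rng.randrange(n_genes))
    return frozenset(genes)


def evolve(samples, labels, pop_size=20, generations=30, seed=0):
    """Replicate and mutate gene sets, keeping the lowest-cost predictors.

    Cost is (misclassifications, number of genes), so accuracy comes first
    and smaller gene sets win ties.
    """
    rng = random.Random(seed)
    n_genes = len(samples[0])

    def cost(genes):
        return (nearest_neighbor_errors(samples, labels, genes), len(genes))

    # Start from single-gene predictors chosen at random.
    population = [frozenset([rng.randrange(n_genes)]) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(g, n_genes, rng) for g in population]
        population = sorted(population + offspring, key=cost)[:pop_size]
    return population


# Toy data (hypothetical): gene 0 separates the classes, gene 1 does not.
samples = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 1.1]]
labels = ["A", "A", "B", "B"]
ensemble = evolve(samples, labels)
```

Ordering the cost as a tuple puts classification errors ahead of gene count, which mirrors the stated goal of shrinking the gene set without giving up accuracy. The result is an ensemble of gene sets sorted from best to worst under that cost, not a single answer.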

Taxonomy

Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Genetics, Bioinformatics, and Biomedical Research