Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data
Mohak Shah, Mario Marchand, Jacques Corbeil

TL;DR
This paper introduces a new feature selection method using conjunctions of decision stumps that identifies small gene subsets with strong future performance guarantees, outperforming existing methods in microarray data classification.
Contribution
The paper proposes a novel learning algorithm based on conjunctions of decision stumps with theoretical performance bounds for gene expression data classification.
Findings
Finds smaller gene subsets with competitive accuracy
Provides tight risk guarantees on future performance
Outperforms existing approaches in microarray classification
Abstract
One of the objectives of designing feature selection learning algorithms is to obtain classifiers that depend on a small number of attributes and have verifiable future performance guarantees. There are few, if any, approaches that successfully address the two goals simultaneously. Performance guarantees become crucial for tasks such as microarray data analysis due to very small sample sizes resulting in limited empirical evaluation. To the best of our knowledge, such algorithms that give theoretical bounds on the future performance have not been proposed so far in the context of the classification of gene expression data. In this work, we investigate the premise of learning a conjunction (or disjunction) of decision stumps in Occam's Razor, Sample Compression, and PAC-Bayes learning settings for identifying a small subset of attributes that can be used to perform reliable…
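To make the idea of a conjunction of decision stumps concrete, here is a minimal sketch. It is not the authors' algorithm. It assumes a simple greedy, set-cover-style heuristic: each stump thresholds a single attribute, the conjunction predicts positive only if every stump agrees, and stumps are added one at a time to reject the remaining negative examples. The names `fit_conjunction`, `predict`, `max_stumps`, and `penalty`, as well as the toy data, are hypothetical, and no risk bound is computed.

```python
from typing import List, Sequence, Tuple

# A stump is (feature index, threshold, direction); it outputs True
# when direction * (x[feature] - threshold) > 0.
Stump = Tuple[int, float, int]


def stump_output(x: Sequence[float], stump: Stump) -> bool:
    f, t, d = stump
    return d * (x[f] - t) > 0


def fit_conjunction(X: List[Sequence[float]], y: List[bool],
                    max_stumps: int = 3, penalty: float = 1.0) -> List[Stump]:
    """Greedy heuristic (an assumption, not the paper's method):
    repeatedly add the stump that rejects the most remaining negatives,
    charging `penalty` for every remaining positive it also rejects."""
    n_features = len(X[0])
    neg_left = [i for i, label in enumerate(y) if not label]
    pos_left = [i for i, label in enumerate(y) if label]
    stumps: List[Stump] = []
    while neg_left and len(stumps) < max_stumps:
        best, best_score = None, 0.0
        for f in range(n_features):
            values = sorted({row[f] for row in X})
            # Candidate thresholds: midpoints between consecutive values.
            for lo, hi in zip(values, values[1:]):
                t = (lo + hi) / 2.0
                for d in (1, -1):
                    s = (f, t, d)
                    covered = sum(not stump_output(X[i], s) for i in neg_left)
                    errors = sum(not stump_output(X[i], s) for i in pos_left)
                    score = covered - penalty * errors
                    if score > best_score:
                        best, best_score = s, score
        if best is None:  # no stump improves the score; stop early
            break
        stumps.append(best)
        # Keep only the examples the conjunction still accepts.
        neg_left = [i for i in neg_left if stump_output(X[i], best)]
        pos_left = [i for i in pos_left if stump_output(X[i], best)]
    return stumps


def predict(stumps: List[Stump], x: Sequence[float]) -> bool:
    # Conjunction: positive only if every stump outputs True.
    return all(stump_output(x, s) for s in stumps)


# Toy data (made up): 4 "genes"; positives have gene 0 high and gene 2 low.
X = [
    [7.0, 1.0, 1.0, 4.0],
    [8.0, 9.0, 2.0, 0.5],
    [6.5, 4.0, 0.5, 9.0],
    [2.0, 1.5, 1.0, 4.0],
    [1.0, 8.0, 2.5, 1.0],
    [7.5, 3.0, 6.0, 5.0],
    [9.0, 2.0, 7.0, 8.0],
]
y = [True, True, True, False, False, False, False]

stumps = fit_conjunction(X, y)
selected_features = sorted(f for f, _, _ in stumps)
train_predictions = [predict(stumps, x) for x in X]
```

The attributes that appear in `stumps` are the selected features, so a short conjunction doubles as a small gene subset. A disjunction can be handled symmetrically by swapping the roles of the positive and negative examples.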
Taxonomy
Topics: Gene expression and cancer classification · Evolutionary Algorithms and Applications · Machine Learning and Data Classification
