A generalised OMP algorithm for feature selection with application to gene expression data
Michail Tsagris, Zacharias Papadovasilakis, Kleanthi Lakiotaki, and Ioannis Tsamardinos

TL;DR
This paper introduces gOMP, a scalable and versatile feature selection algorithm that outperforms LASSO in various gene expression data analyses, including classification, regression, and survival prediction.
Contribution
The paper presents gOMP, a generalised and scalable version of Orthogonal Matching Pursuit that applies to multiple outcome types and predictive models, with theoretical advantages and better performance than LASSO in the reported comparisons.
Findings
gOMP outperforms LASSO in binary classification tasks.
gOMP is effective for regression and survival analysis.
gOMP is simple, easy to implement, and generalizable.
Abstract
Feature selection for predictive analytics is the problem of identifying a minimal-size subset of features that is maximally predictive of an outcome of interest. To apply to molecular data, feature selection algorithms need to be scalable to tens of thousands of available features. In this paper, we propose gOMP, a highly-scalable generalisation of the Orthogonal Matching Pursuit feature selection algorithm to several directions: (a) different types of outcomes, such as continuous, binary, nominal, and time-to-event, (b) different types of predictive models (e.g., linear least squares, logistic regression), (c) different types of predictive features (continuous, categorical), and (d) different, statistical-based stopping criteria. We compare the proposed algorithm against LASSO, a prototypical, widely used algorithm for high-dimensional data. On dozens of simulated datasets, as well…
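To make the underlying idea concrete, here is a minimal sketch of classical Orthogonal Matching Pursuit for a continuous outcome with a linear least-squares model, which is the setting that gOMP generalises. It is not the authors' implementation: the function name `omp_select`, the parameters `max_features` and `tol`, and the stopping rule (stop when the relative drop in the residual sum of squares is small) are assumptions chosen for illustration, and the paper's statistical stopping criteria may differ. The sketch also assumes no feature column is constant.

```python
import numpy as np


def omp_select(X, y, max_features=10, tol=1e-3):
    """Classical OMP-style forward selection for a continuous outcome.

    Illustrative sketch only; names and the stopping rule are hypothetical.
    """
    n, _ = X.shape
    # Standardise columns so inner products with the residual are comparable.
    # Assumes no column has zero variance.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    residual = y - y.mean()
    rss_prev = float(residual @ residual)
    selected = []

    while len(selected) < max_features:
        # Score each feature by its absolute inner product with the residual.
        scores = np.abs(Xs.T @ residual)
        if selected:
            scores[selected] = -np.inf  # never pick the same feature twice
        j = int(np.argmax(scores))
        candidate = selected + [j]

        # Refit least squares on all chosen features (the "orthogonal" step).
        A = np.column_stack([np.ones(n), Xs[:, candidate]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        new_residual = y - A @ coef
        rss = float(new_residual @ new_residual)

        # Hypothetical stopping rule: stop when the relative RSS drop is small.
        if rss_prev - rss <= tol * rss_prev:
            break
        selected, residual, rss_prev = candidate, new_residual, rss

    return selected
```

Calling `omp_select(X, y, max_features=5)` on an `n × p` NumPy array `X` and a length-`n` vector `y` returns the column indices in the order they were selected. Extending the sketch to binary or time-to-event outcomes would mean replacing the least-squares refit and residual with those of the corresponding model, which is the direction the abstract describes.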
Taxonomy
Topics: Gene expression and cancer classification · Face and Expression Recognition · Neural Networks and Applications
Methods: Feature Selection
