Feature reduction for machine learning on molecular features: The GeneScore
Alexander Denker, Anastasia Steshina, Theresa Grooss, Frank Ueckert, Sylvia Nürnberg

TL;DR
The paper introduces GeneScore, a feature reduction method that combines various molecular data types into a single score, improving cancer classification accuracy in biomedical machine learning applications.
Contribution
GeneScore is a knowledge-based feature reduction technique that classifies cancer entities better than a binary matrix representation of the same molecular data (SNV, Indel, CNV, gene fusion and gene expression).
Findings
GeneScore outperforms a binary matrix when classifying cancer entities.
Integrates multiple molecular data types into a single score.
Utilizes expert knowledge for feature reduction.
Abstract
We present the GeneScore, a concept of feature reduction for Machine Learning analysis of biomedical data. Using expert knowledge, the GeneScore integrates different molecular data types into a single score. We show that the GeneScore is superior to a binary matrix in the classification of cancer entities from SNV, Indel, CNV, gene fusion and gene expression data. The GeneScore is a straightforward way to facilitate state-of-the-art analysis, while making use of the available scientific knowledge on the nature of molecular data features used.
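To make the contrast in the abstract concrete, here is a minimal sketch of the two feature representations. It is not the authors' scoring scheme: the abstract does not state how alterations are weighted, so the `ALTERATION_WEIGHTS` values, the function names, and the example sample are all hypothetical placeholders, and gene expression is left out for brevity. The sketch only shows how several per-gene alteration columns can collapse into one score per gene.

```python
from collections import defaultdict

# Hypothetical weights per alteration type. The abstract does not give the
# actual expert-derived rules, so these numbers are placeholders only.
ALTERATION_WEIGHTS = {
    "SNV": 1.0,
    "Indel": 1.0,
    "CNV_gain": 0.5,
    "CNV_loss": -0.5,
    "fusion": 2.0,
}


def binary_matrix_row(alterations, genes):
    """One column per (gene, alteration type): 1 if observed, else 0."""
    observed = set(alterations)
    return [
        1 if (gene, kind) in observed else 0
        for gene in genes
        for kind in ALTERATION_WEIGHTS
    ]


def gene_score_row(alterations, genes):
    """One column per gene: sum of the weights of its observed alterations."""
    scores = defaultdict(float)
    for gene, kind in set(alterations):
        scores[gene] += ALTERATION_WEIGHTS[kind]
    return [scores[gene] for gene in genes]


# Illustrative sample: a list of (gene, alteration type) pairs.
genes = ["TP53", "KRAS", "ALK"]
sample = [("TP53", "SNV"), ("TP53", "CNV_loss"), ("ALK", "fusion")]

binary = binary_matrix_row(sample, genes)  # 3 genes x 5 types = 15 columns
scored = gene_score_row(sample, genes)     # 3 columns, one per gene
```

In this toy setup the binary representation needs one column per gene and alteration type, while the score representation needs one column per gene, which is the kind of reduction the abstract describes.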
Taxonomy
Topics: Gene expression and cancer classification · Machine Learning in Bioinformatics · Genomics and Phylogenetic Studies
