Topological Features In Cancer Gene Expression Data
Svetlana Lockwood, Bala Krishnamoorthy

TL;DR
This paper introduces a novel topological data analysis method for cancer gene expression data that identifies key genes and topological features linked to cancer, aiding in understanding and potential biomarker discovery.
Contribution
The paper presents a new algebraic topology-based approach for gene selection and topological feature detection in high-dimensional cancer data, revealing biologically relevant structures.
Findings
Identified persistent loops (one-dimensional holes) whose member genes have often been implicated in cancer in the scientific literature.
Selected small gene subsets capturing significant topological features.
Illustrated the method on five data sets covering brain, breast, leukemia, and ovarian cancers.
Abstract
We present a new method for exploring cancer gene expression data based on tools from algebraic topology. Our method selects a small relevant subset from tens of thousands of genes while simultaneously identifying nontrivial higher order topological features, i.e., holes, in the data. We first circumvent the problem of high dimensionality by dualizing the data, i.e., by studying genes as points in the sample space. Then we select a small subset of the genes as landmarks to construct topological structures that capture persistent, i.e., topologically significant, features of the data set in its first homology group, i.e., loops. Furthermore, we demonstrate that many members of these loops have been implicated in cancer biogenesis in the scientific literature. We illustrate our method on five different data sets belonging to brain, breast, leukemia, and ovarian cancers.
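The steps named in the abstract (dualize, pick landmarks, look for loops in the first homology group) can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the greedy maxmin landmark rule, the Euclidean metric, the Vietoris-Rips complex, the GF(2) coefficients, and every function name here are choices made for the example. It also counts loops at a single scale `eps`, which is a simplification of persistence, where loops are tracked across a range of scales.

```python
import math
from itertools import combinations

def dualize(expr):
    """Transpose a samples-by-genes matrix so each gene becomes a point in sample space."""
    return [list(col) for col in zip(*expr)]

def maxmin_landmarks(points, k, start=0):
    """Greedy maxmin (assumed heuristic): repeatedly add the point farthest from the chosen set."""
    chosen = [start]
    dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are stored as Python-int bitsets."""
    pivots = {}  # leading bit position -> reduced row
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]  # clear the leading bit and keep reducing
    return len(pivots)

def betti1_rips(points, eps):
    """First Betti number (loop count) of the Vietoris-Rips complex at scale eps."""
    n = len(points)
    edges = [e for e in combinations(range(n), 2)
             if math.dist(points[e[0]], points[e[1]]) <= eps]
    edge_id = {e: i for i, e in enumerate(edges)}
    tris = [t for t in combinations(range(n), 3)
            if all(p in edge_id for p in combinations(t, 2))]
    d1 = [(1 << a) | (1 << b) for a, b in edges]          # edge -> its two vertices
    d2 = [sum(1 << edge_id[p] for p in combinations(t, 2)) for t in tris]  # triangle -> its three edges
    # dim H1 = dim ker(d1) - rank(d2) = |E| - rank(d1) - rank(d2)
    return len(edges) - gf2_rank(d1) - gf2_rank(d2)

# Toy data (hypothetical): 2 samples x 60 genes, arranged so the genes lie on a circle.
n_genes = 60
expr = [[math.cos(2 * math.pi * i / n_genes) for i in range(n_genes)],
        [math.sin(2 * math.pi * i / n_genes) for i in range(n_genes)]]

genes = dualize(expr)                       # 60 points in 2-dimensional sample space
landmarks = maxmin_landmarks(genes, k=12)   # indices of 12 landmark genes
pts = [genes[i] for i in landmarks]
print("landmark genes:", sorted(landmarks))
print("loops at eps=0.75:", betti1_rips(pts, 0.75))
```

On this toy circle the 12 landmarks should recover a single loop at `eps=0.75`. For real expression data, a dedicated persistent homology library would replace `betti1_rips`, since enumerating all triangles scales poorly and a single scale can miss or invent features.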
