Genomics Data Analysis via Spectral Shape and Topology
Erik J. Amézquita, Farzana Nasrin, Kathleen M. Storey, and Masato Yoshizawa

TL;DR
This paper introduces a novel workflow combining Mapper and differential gene expression analysis to reveal distinct subgroups in tumor RNA-seq data, uncovering new insights into lung cancer pathways.
Contribution
The paper develops a new approach integrating Mapper with statistical inference tools to analyze high-dimensional genomic data more effectively than existing methods.
Findings
Successfully separated tumor and healthy subjects using Mapper-based graphical structures.
Identified two distinct gene regulation pathways in tumor subgroups.
Demonstrated the superiority of Mapper over t-SNE in revealing tumor heterogeneity.
Abstract
Mapper, a topological algorithm, is frequently used as an exploratory tool to build a graphical representation of data. This representation can help to gain a better understanding of the intrinsic shape of high-dimensional genomic data and to retain information that may be lost using standard dimension-reduction algorithms. We propose a novel workflow to process and analyze RNA-seq data from tumor and healthy subjects integrating Mapper and differential gene expression. Precisely, we show that a Gaussian mixture approximation method can be used to produce graphical structures that successfully separate tumor and healthy subjects, and produce two subgroups of tumor subjects. A further analysis using DESeq2, a popular tool for the detection of differentially expressed genes, shows that these two subgroups of tumor cells bear two distinct gene regulations, suggesting two discrete paths for…
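To make the Mapper construction mentioned in the abstract concrete, here is a minimal sketch of the generic algorithm on toy 2-D points. It is not the authors' pipeline: the filter (the x-coordinate), the single-linkage clustering step, the parameter values, and all function names (interval_cover, single_linkage, mapper_graph) are assumptions chosen for illustration, and the Gaussian mixture approximation and DESeq2 steps described in the abstract are not reproduced.

```python
from itertools import combinations
from math import dist


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; `overlap` is the shared fraction."""
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - overlap))
    step = length * (1 - overlap)
    cover = []
    for i in range(n_intervals):
        start = lo + i * step
        # Pin the last endpoint to hi so rounding cannot drop the maximum.
        end = hi if i == n_intervals - 1 else start + length
        cover.append((start, end))
    return cover


def single_linkage(indices, points, eps):
    """Connected components of the eps-neighborhood graph on `indices`."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining if dist(points[i], points[j]) <= eps}
            remaining -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, filter_values, n_intervals=4, overlap=0.3, eps=1.0):
    """Return (nodes, edges): nodes are clusters of point indices,
    edges join two nodes that share at least one point."""
    lo, hi = min(filter_values), max(filter_values)
    nodes = []
    for a, b in interval_cover(lo, hi, n_intervals, overlap):
        preimage = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage(preimage, points, eps))
    edges = {
        (u, v)
        for u, v in combinations(range(len(nodes)), 2)
        if nodes[u] & nodes[v]
    }
    return nodes, edges


# Toy data (hypothetical): two well-separated groups standing in for
# "healthy" and "tumor" samples; real input would be expression profiles.
group_a = [(0.5 * k, 0.0) for k in range(9)]         # x = 0.0 .. 4.0
group_b = [(10.0 + 0.5 * k, 5.0) for k in range(5)]  # x = 10.0 .. 12.0
points = group_a + group_b
filter_values = [p[0] for p in points]               # assumed filter: x-coordinate

nodes, edges = mapper_graph(points, filter_values)
print(len(nodes), sorted(edges))
```

In this toy run no node mixes points from the two groups, and the only edge links two overlapping clusters within the first group. On real RNA-seq data the choice of filter, cover, and clustering method strongly affects the resulting graph, so these settings should be read as placeholders rather than recommendations.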
Taxonomy
Topics: Bioinformatics and Genomic Networks · Topological and Geometric Data Analysis · Gene expression and cancer classification
