Topological Data Analysis for Unsupervised Feature Selection in Large Scale Spatial Omics Data Sets
James Boyle, Gregory Hamm, Eleanor Williams, Robin JG Hartman, Magnus Soderburg, Ian Henry, and Michael Casey

TL;DR
This paper introduces a topological data analysis approach using persistent homology to quantify spatial gene expression structures, improving unsupervised feature selection in large-scale spatial omics data and providing new biological insights.
Contribution
The paper applies persistent homology to produce a continuous quantification of spatial structure in each gene's expression, enabling improved unsupervised feature selection across spatial omics modalities.
Findings
Identifies spatially variable genes in spatial transcriptomics data.
Provides biological insights into kidney disease and myocardial infarction.
Extends methodology to spatial metabolomics data.
Abstract
Spatial transcriptomics studies are becoming increasingly large and commonplace, necessitating simultaneous analysis of a large number of spatially resolved variables. Correspondingly, a diverse range of methodologies have been proposed to compare the spatial expression structure of genes. Here, we apply persistent homology, a method from topological data analysis, to produce a continuous quantification of spatial structure in a given gene's expression, and show how this can be used for downstream tasks such as spatially variable gene identification. We explore the unique advantages of topology for this task, deriving biologically meaningful insights into kidney disease and myocardial infarction using public spatial transcriptomics data. We also show how the non-parametric nature of homology enables our methodology to extend naturally to other spatial omics modalities, demonstrating…
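To make the idea of a "continuous quantification of spatial structure" concrete, here is a minimal sketch of 0-dimensional persistent homology on a gridded expression map. It is not the authors' implementation. The excerpt does not specify their filtration, homology dimensions, or summary statistic, so the superlevel-set filtration, the 4-neighbour connectivity, the total-persistence score, and the function names `superlevel_persistence_0d` and `total_persistence` are all assumptions made for illustration.

```python
def superlevel_persistence_0d(grid):
    """Return (birth, death) pairs for connected components of superlevel sets.

    Assumption: `grid` is a rectangular list of lists holding one gene's
    expression on a regular spatial grid, with 4-neighbour connectivity.
    """
    rows, cols = len(grid), len(grid[0])
    # Visit pixels from highest to lowest expression value.
    order = sorted(
        ((grid[r][c], r, c) for r in range(rows) for c in range(cols)),
        reverse=True,
    )
    parent = {}  # union-find forest over pixels added so far
    birth = {}   # value at which each pixel entered the filtration

    def find(x):
        # Root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        node = (r, c)
        parent[node] = node
        birth[node] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbour = (r + dr, c + dc)
            if neighbour not in parent:
                continue  # outside the grid or not yet added
            a, b = find(node), find(neighbour)
            if a == b:
                continue
            # Elder rule: the component born at the higher value survives.
            if birth[a] < birth[b]:
                a, b = b, a
            if birth[b] > value:
                # The younger component dies here; skip zero-persistence pairs.
                pairs.append((birth[b], value))
            parent[b] = a
    # The component of the global maximum never merges into an older one;
    # by convention here it is closed off at the global minimum.
    pairs.append((order[0][0], order[-1][0]))
    return pairs


def total_persistence(pairs):
    """Sum of lifetimes: one possible continuous score of spatial structure."""
    return sum(b - d for b, d in pairs)


# Toy example: two expression peaks on a zero background.
two_peaks = [
    [0, 0, 0, 0, 0],
    [0, 5, 0, 3, 0],
    [0, 0, 0, 0, 0],
]
flat = [[1, 1, 1], [1, 1, 1]]

peak_pairs = superlevel_persistence_0d(two_peaks)
print(sorted(peak_pairs), total_persistence(peak_pairs))
print(total_persistence(superlevel_persistence_0d(flat)))
```

In this toy setting, a gene with distinct spatial peaks receives a larger score than a spatially flat one, so ranking genes by such a score gives a simple unsupervised filter. Because the computation uses only the ordering and adjacency of values, it makes no distributional assumption about the measurements, which is the sense in which this kind of score could carry over to other gridded omics readouts.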
Taxonomy
Topics: Bioinformatics and Genomic Networks · Metabolomics and Mass Spectrometry Studies
