TL;DR
This paper introduces three topologically motivated multiscale methods for unsupervised feature selection in single-cell transcriptomics, capturing both discrete and continuous gene expression patterns across multiple scales.
Contribution
The paper presents three novel mathematical techniques for identifying biologically relevant genes: eigenscores, the multiscale Laplacian score (MLS), and the persistent Rayleigh quotient (PRQ). They take the geometry and scale of the data into account, improving on traditional clustering-based methods.
Findings
Validated methods on published datasets, detecting known and new biologically meaningful genes.
Provided multidimensional gene rankings and visualizations of gene relationships.
Demonstrated the ability to separate genes that play different roles in a bifurcation process (e.g., along pseudo-time).
Abstract
Analysis of single-cell transcriptomics often relies on clustering cells and then performing differential gene expression (DGE) to identify genes that vary between these clusters. These discrete analyses successfully determine cell types and markers; however, continuous variation within and between cell types may not be detected. We propose three topologically motivated mathematical methods for unsupervised feature selection that consider discrete and continuous transcriptional patterns on an equal footing across multiple scales simultaneously. Eigenscores rank signals or genes based on their correspondence to low-frequency intrinsic patterning in the data using the spectral decomposition of the graph Laplacian. The multiscale Laplacian score (MLS) is an unsupervised method for locating relevant scales in data and selecting the genes that are coherently expressed at…
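To make the spectral idea in the abstract concrete, here is a minimal sketch, not the paper's implementation. It assumes a symmetrized k-nearest-neighbour graph over cells with an unnormalized Laplacian, and it scores a gene in two ways: by its Rayleigh quotient (small when expression varies smoothly over the graph) and by its projections onto the lowest nontrivial Laplacian eigenvectors. The function names, the toy ring-shaped dataset, and the parameter choices are all hypothetical and chosen only for illustration; the paper's exact definitions of eigenscores and MLS may differ.

```python
import numpy as np

def knn_graph_laplacian(points, k=6):
    """Unnormalized Laplacian L = D - W of a symmetrized kNN graph (illustrative choice)."""
    n = points.shape[0]
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                # a cell is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]        # indices of the k nearest cells
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)                      # symmetrize the adjacency matrix
    return np.diag(W.sum(axis=1)) - W

def rayleigh_quotient(L, f):
    """f^T L f / f^T f for the mean-centred signal f; low values mean smooth signals."""
    f = f - f.mean()
    return float(f @ L @ f / (f @ f))

def low_frequency_projections(L, f, n_modes=2):
    """Absolute projections of the centred, unit-norm signal onto the first
    n_modes nontrivial eigenvectors (assumes the graph is connected)."""
    _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    f = f - f.mean()
    f = f / np.linalg.norm(f)
    return np.abs(vecs[:, 1:n_modes + 1].T @ f)

# Toy data: 200 "cells" on a ring, i.e. a purely continuous pattern with no clusters.
n = 200
theta = 2 * np.pi * np.arange(n) / n
cells = np.column_stack([np.cos(theta), np.sin(theta)])
L = knn_graph_laplacian(cells, k=6)

rng = np.random.default_rng(0)
smooth_gene = np.cos(theta)                     # varies slowly around the ring
noisy_gene = rng.standard_normal(n)             # no structure on the graph

rq_smooth = rayleigh_quotient(L, smooth_gene)
rq_noisy = rayleigh_quotient(L, noisy_gene)
proj_smooth = low_frequency_projections(L, smooth_gene)
proj_noisy = low_frequency_projections(L, noisy_gene)

print("Rayleigh quotient (smooth, noisy):", rq_smooth, rq_noisy)
print("Low-frequency energy (smooth, noisy):",
      np.sum(proj_smooth**2), np.sum(proj_noisy**2))
```

In this toy setting the smooth gene receives a much lower Rayleigh quotient and concentrates its energy in the lowest nontrivial modes, even though no clustering of the cells would single it out. A multiscale variant would repeat the computation over a family of graphs built at different scales, which this sketch does not attempt.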
Taxonomy
Methods: Feature Selection
