Is your data alignable? Principled and interpretable alignability testing and integration of single-cell data
Rong Ma, Eric D. Sun, David Donoho, James Zou

TL;DR
This paper introduces SMAI, a spectral manifold alignment framework that provides a statistically rigorous, interpretable, and structure-preserving method for testing and integrating single-cell datasets, improving downstream biological analyses.
Contribution
SMAI offers the first principled statistical test for dataset alignability and enhances data integration with interpretability and structure preservation, outperforming existing methods.
Findings
SMAI outperforms existing alignment methods on benchmark datasets.
It improves downstream analyses such as identifying differentially expressed genes and analyzing spatial transcriptomics data.
SMAI enables quantification of technical confounders in single-cell data.
Abstract
Single-cell data integration can provide a comprehensive molecular view of cells, and many algorithms have been developed to remove unwanted technical or biological variations and integrate heterogeneous single-cell datasets. Despite their wide usage, existing methods suffer from several fundamental limitations. In particular, we lack a rigorous statistical test for whether two high-dimensional single-cell datasets are alignable (and therefore should even be aligned). Moreover, popular methods can substantially distort the data during alignment, making the aligned data and downstream analysis difficult to interpret. To overcome these limitations, we present a spectral manifold alignment and inference (SMAI) framework, which enables principled and interpretable alignability testing and structure-preserving integration of single-cell data with the same type of features. SMAI provides a…
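To make the idea of structure-preserving alignment concrete, here is a minimal Python sketch. It is not SMAI's algorithm, which this summary does not describe. As a stand-in it uses orthogonal Procrustes alignment (a rotation plus a shift). It also assumes the rows of the two matrices are matched cells, which real single-cell datasets generally do not provide. The function names `rigid_align` and `pairwise_dists` are hypothetical. Because the transformation is rigid, the pairwise distances within the aligned dataset are unchanged, which is one simple sense in which an alignment can avoid distorting the data.

```python
import numpy as np

def rigid_align(X, Y):
    """Rotate and shift Y onto X (illustrative; assumes row i of X matches row i of Y)."""
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    # Orthogonal Procrustes: R minimizes ||Yc @ R - Xc||_F over orthogonal matrices.
    U, _, Vt = np.linalg.svd(Yc.T @ Xc)
    R = U @ Vt
    return Yc @ R + mu_x, R

def pairwise_dists(Z):
    """Euclidean distance between every pair of rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Synthetic example: Y is a rotated and shifted copy of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal matrix
Y = X @ Q + 3.0

Y_aligned, R = rigid_align(X, Y)

# The rigid map leaves the internal geometry of Y untouched.
distortion = np.abs(pairwise_dists(Y_aligned) - pairwise_dists(Y)).max()
print("max change in pairwise distances:", distortion)
print("max alignment error:", np.abs(Y_aligned - X).max())
```

In this toy setting both printed values should be at the level of floating-point error. With real data the alignment error would not vanish, and whether alignment is appropriate at all is the question an alignability test is meant to answer first.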
Taxonomy
Topics: Single-cell and spatial transcriptomics · Cell Image Analysis Techniques · Bioinformatics and Genomic Networks
