Nonparametric Bayesian Two-Level Clustering for Subject-Level Single-Cell Expression Data
Qiuyu Wu, Xiangyu Luo

TL;DR
This paper introduces SCSC, a nonparametric Bayesian model that simultaneously groups subjects and clusters cells into cell types in multi-subject single-cell expression data, accounting for subject heterogeneity and the characteristics of raw count data.
Contribution
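The paper develops a joint clustering model that infers the number of subject subgroups and the number of cell types from the data instead of requiring them in advance, bridging the gap between methods that cluster cells while ignoring subject heterogeneity and methods that group subjects without using single-cell information.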
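To illustrate how a nonparametric Bayesian prior can leave the number of clusters unspecified, here is a minimal sketch of a Chinese restaurant process (CRP), a standard construction in this family. It is an assumption chosen for illustration and not necessarily the prior SCSC uses; the function name `crp_assignments` and the concentration parameter `alpha` are hypothetical. The sketch only samples from a prior: it performs no inference on expression data and does not couple the subject and cell levels.

```python
import random


def crp_assignments(n_items, alpha, rng):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Illustrative only: a generic CRP, not the SCSC model itself.
    alpha > 0 controls how readily new clusters are opened.
    """
    labels = []
    counts = []  # counts[k] = number of items already assigned to cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    # Subject level: the number of subgroups is an outcome of the draw, not an input.
    subject_groups = crp_assignments(20, alpha=1.0, rng=rng)
    # Cell level: likewise for cell types among 500 cells.
    cell_types = crp_assignments(500, alpha=1.0, rng=rng)
    print("subject subgroups:", len(set(subject_groups)))
    print("cell types:", len(set(cell_types)))
```

In this sketch the cluster count grows with the data and with `alpha`, which is the sense in which no count has to be fixed beforehand. Matching cell types across subjects, as SCSC does, would require sharing cluster parameters between subjects, which this independent two-draw sketch does not attempt.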
Findings
SCSC successfully clusters both subjects and cells in simulated data.
It accurately matches cell types across subjects.
Its effectiveness is validated on real multi-subject scRNA-seq data.
Abstract
The advent of single-cell sequencing opens new avenues for personalized treatment. In this paper, we address a two-level clustering problem of simultaneous subject subgroup discovery (subject level) and cell type detection (cell level) for single-cell expression data from multiple subjects. However, current statistical approaches either cluster cells without considering subject heterogeneity or group subjects without using the single-cell information. To bridge the gap between cell clustering and subject grouping, we develop a nonparametric Bayesian model, the Subject and Cell clustering for Single-Cell expression data (SCSC) model, to achieve subject and cell grouping simultaneously. SCSC does not need the number of subject subgroups or the number of cell types to be prespecified. It automatically induces subject subgroup structures and matches cell types across subjects. Moreover, it directly…
