Multiple Hypothesis Testing To Estimate The Number Of Communities in Stochastic Block Models
Chetkar Jha, Mingyao Li, Ian Barnett

TL;DR
This paper introduces a new statistical method combining likelihood-based network extraction and sequential multiple testing to accurately estimate the number of communities in stochastic block models, especially for noisy scRNA-seq data.
Contribution
The paper presents a novel sequential multiple testing approach for estimating the number of communities in stochastic block models (SBMs), with proven consistency and competitive performance on benchmark datasets.
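As background, a stochastic block model assigns each node to one of K communities and draws each edge independently, with a probability that depends only on the communities of its two endpoints. The sketch below is an illustrative sampler only, not the paper's estimation method; the function name `sample_sbm`, the block sizes, and the edge probabilities are all assumptions chosen for the example.

```python
import random


def sample_sbm(block_sizes, probs, seed=0):
    """Sample an undirected SBM adjacency matrix without self-loops.

    block_sizes: number of nodes in each community (hypothetical values).
    probs[a][b]: edge probability between communities a and b (symmetric).
    """
    rng = random.Random(seed)
    # Node labels: community index repeated once per node in that community.
    labels = [k for k, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edge probability depends only on the two community labels.
            if rng.random() < probs[labels[i]][labels[j]]:
                adj[i][j] = adj[j][i] = 1
    return adj, labels


# Three communities with dense within-block and sparse between-block edges.
adj, labels = sample_sbm(
    [30, 30, 40],
    [[0.5, 0.05, 0.05], [0.05, 0.5, 0.05], [0.05, 0.05, 0.5]],
)
```

Here K = 3 is known by construction. Estimating K is the reverse problem: given only `adj`, recover how many communities generated it.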
Findings
The sequential multiple testing (SMT) method is consistent under moderate sparsity.
The approach outperforms existing methods on benchmark datasets.
Application to real scRNA-seq data reveals meaningful cell subgroups.
Abstract
Clustering of single-cell RNA sequencing (scRNA-seq) datasets can give key insights into the biological functions of cells. Therefore, it is not surprising that network-based community detection methods (one of the better clustering methods) are increasingly being used for the clustering of scRNA-seq datasets. The main challenge in implementing network-based community detection methods for scRNA-seq datasets is that these methods require the true number of communities, or blocks, a priori in order to estimate the community memberships. Although there are existing methods for estimating the number of communities, they are not suitable for noisy scRNA-seq datasets. Moreover, we require an appropriate method for extracting suitable networks from scRNA-seq datasets. To address these issues, we present a two-fold solution: i) a simple likelihood-based approach for extracting stochastic…
Taxonomy
Topics: Single-cell and spatial transcriptomics · Bayesian Methods and Mixture Models · Complex Network Analysis Techniques
