Using Stochastic Block Models for Community Detection: The issue of edge-connectivity
The-Anh Vu-Le, Minhyuk Park, Ian Chen, George Chacko, Tandy Warnow

TL;DR
This paper investigates the connectivity issues in communities detected by stochastic block models (SBMs), demonstrating that many methods produce disconnected communities and proposing Well-Connected Clusters (WCC) as an effective solution.
Contribution
The study extends the analysis of community connectivity issues in SBM clustering to various software and models, and shows WCC improves accuracy and scalability across different methods.
Findings
All tested SBM methods produce disconnected communities.
Graph-tool's degree-corrected SBM improves connectivity over PySBM.
WCC enhances clustering accuracy and scales to large networks.
Abstract
A relevant, sometimes overlooked, quality criterion for communities in graphs is that they should be well-connected in addition to being edge-dense. Prior work has shown that leading community detection methods can produce poorly-connected communities, and some even produce internally disconnected communities. A recent study by Park et al. in Complex Networks and their Applications 2024 showed that this problem is evident in clusterings from three Stochastic Block Models (SBMs) in graph-tool, a popular software package. To address this issue, Park et al. presented a simple technique, Well-Connected Clusters (WCC), that repeatedly finds and removes small edge cuts of size at most $\log_{10} n$ in clusters, where $n$ is the number of nodes in the cluster, and showed that treatment of graph-tool SBM clusterings with WCC improves accuracy. Here we examine the question of cluster connectivity…
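The cut-removal loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses NetworkX's Stoer-Wagner minimum cut as a stand-in for whatever cut routine WCC uses, the function name `wcc_sketch` is hypothetical, and the size bound is passed in as a function of cluster size, with a default of log10(n) that is an assumption of this sketch.

```python
import math
import networkx as nx


def wcc_sketch(graph, clusters, threshold=lambda n: math.log10(n)):
    """Split clusters until none has an edge cut of size <= threshold(n).

    Illustrative only: `threshold` defaults to log10(n) as an assumption,
    and Stoer-Wagner stands in for the cut routine of the real method.
    """
    result = []
    stack = [set(c) for c in clusters]
    while stack:
        nodes = stack.pop()
        if len(nodes) < 2:
            # A single node cannot be cut further; keep it as-is.
            result.append(nodes)
            continue
        sub = graph.subgraph(nodes)
        if not nx.is_connected(sub):
            # A disconnected cluster has a cut of size 0: split it
            # into its connected components and re-examine each one.
            stack.extend(set(c) for c in nx.connected_components(sub))
            continue
        # Minimum edge cut of the induced subgraph (unit edge weights).
        cut_size, (side_a, side_b) = nx.stoer_wagner(sub)
        if cut_size <= threshold(len(nodes)):
            # Small cut: remove it and process both sides again.
            stack.append(set(side_a))
            stack.append(set(side_b))
        else:
            # No small cut remains, so the cluster is kept.
            result.append(nodes)
    return result


# Two 5-cliques joined by a single bridge edge, given as one cluster.
g = nx.barbell_graph(5, 0)
parts = sorted(sorted(c) for c in wcc_sketch(g, [set(g.nodes)]))
print(parts)
```

In this example the bridge is a cut of size 1 in a 10-node cluster, which does not exceed the assumed bound of log10(10) = 1, so the cluster is split into its two cliques; each clique then has a minimum cut of 4 and is kept.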
