Determining the Number of Communities in Sparse and Imbalanced Settings
Zhixuan Shao, Can M. Le

TL;DR
This paper introduces a spectral method using a novel network operator to accurately determine the number of communities in sparse and imbalanced networks, overcoming limitations of existing approaches.
Contribution
The authors develop a centered non-backtracking spectral operator that improves estimation of the number of communities in sparse and imbalanced networks, supported by theoretical and numerical validation.
Findings
Effective in ultra-sparse networks
Handles community size and density imbalance
Provides a reliable goodness-of-fit test
Abstract
Community structures represent a crucial aspect of network analysis, and various methods have been developed to identify these communities. However, a common hurdle lies in determining the number of communities K, a parameter that often requires estimation in practice. Existing approaches for estimating K face two notable challenges: the weak community signal present in sparse networks and the imbalance in community sizes or edge densities that result in unequal per-community expected degree. We propose a spectral method based on a novel network operator whose spectral properties effectively overcome both challenges. This operator is a refined version of the non-backtracking operator, adapted from a "centered" adjacency matrix. Its leading eigenvalues are more concentrated than those of the adjacency matrix for sparse networks, while they also demonstrate enhanced signal under imbalance…
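A minimal sketch of the general idea behind non-backtracking spectral estimates of K, for orientation only. It uses the standard (uncentered) Hashimoto non-backtracking matrix and the well-known heuristic of counting real eigenvalues that lie outside the bulk radius sqrt(rho), where rho is the spectral radius. It is not the authors' centered operator, whose exact definition is not given in this summary. The function names `non_backtracking_matrix` and `estimate_k` and the tolerance `tol` are assumptions introduced for illustration.

```python
import numpy as np


def non_backtracking_matrix(A):
    """Standard Hashimoto non-backtracking matrix of an undirected simple graph.

    Rows and columns are indexed by directed edges (u, v).
    B[(u, v), (v, w)] = 1 whenever (v, w) is an edge and w != u.
    Note: this is the uncentered operator, not the paper's centered variant.
    """
    n = A.shape[0]
    edges = [(u, v) for u in range(n) for v in range(n)
             if u != v and A[u, v] != 0]
    index = {e: i for i, e in enumerate(edges)}
    B = np.zeros((len(edges), len(edges)))
    for (u, v), i in index.items():
        for w in range(n):
            if w != u and w != v and A[v, w] != 0:
                B[i, index[(v, w)]] = 1.0
    return B


def estimate_k(A, tol=1e-8):
    """Count real eigenvalues of B strictly outside the bulk radius sqrt(rho).

    Heuristic sketch only: eigenvalues with |imaginary part| < tol are
    treated as real.
    """
    B = non_backtracking_matrix(A)
    eig = np.linalg.eigvals(B)
    rho = np.max(np.abs(eig))          # spectral radius of B
    bulk = np.sqrt(rho)                # approximate radius of the bulk
    real = eig[np.abs(eig.imag) < tol].real
    return int(np.sum(real > bulk + tol))


if __name__ == "__main__":
    # Two disjoint 4-cliques: a toy graph with two obvious groups.
    K4 = np.ones((4, 4)) - np.eye(4)
    A = np.block([[K4, np.zeros((4, 4))], [np.zeros((4, 4)), K4]])
    print(estimate_k(A))
```

This dense construction costs O(m^2) memory for m edges, so it is only suitable for small toy graphs; a practical implementation would use sparse matrices and an iterative eigensolver.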
