TL;DR
This paper develops a principled statistical framework for choosing, on the basis of empirical data, among generative network models that may or may not include features such as degree correction and community overlap, with the aim of better describing large-scale network structure.
Contribution
It introduces a model selection method based on minimum description length and posterior odds ratios, which accounts for model complexity, and applies the method to real networks.
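As a rough illustration of how a description length relates to a posterior odds ratio, the sketch below assumes that each candidate model's description length is the negative log of its joint probability with the data, measured in nats, so that the odds ratio between two models is the exponential of the negative difference in description lengths. The function names, the `prior_odds` parameter, and the example values are hypothetical and are not taken from the paper.

```python
import math


def log_posterior_odds(dl_a: float, dl_b: float, prior_odds: float = 1.0) -> float:
    """Log posterior odds of model a over model b.

    dl_a, dl_b: description lengths in nats (assumed to equal -ln of the
    joint probability of data and model parameters).
    prior_odds: prior preference for a over b (1.0 means no preference).
    """
    return math.log(prior_odds) - (dl_a - dl_b)


def posterior_odds(dl_a: float, dl_b: float, prior_odds: float = 1.0) -> float:
    """Posterior odds ratio; values above 1 favour model a."""
    return math.exp(log_posterior_odds(dl_a, dl_b, prior_odds))


if __name__ == "__main__":
    # Hypothetical description lengths, for illustration only.
    dl_degree_corrected = 1000.0
    dl_non_corrected = 1002.0
    odds = posterior_odds(dl_degree_corrected, dl_non_corrected)
    print(f"odds in favour of the shorter description: {odds:.3f}")
```

Working with the log odds is usually safer in practice, since description lengths for real networks can differ by hundreds of nats and the plain ratio would overflow or underflow.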
Findings
Degree correction is almost universally supported by data.
Community overlap is rarely statistically justified.
The method effectively distinguishes between competing models.
Abstract
The effort to understand network systems in increasing detail has resulted in a diversity of methods designed to extract their large-scale structure from data. Unfortunately, many of these methods yield diverging descriptions of the same network, making both the comparison and understanding of their results a difficult challenge. A possible solution to this outstanding issue is to shift the focus away from ad hoc methods and move towards more principled approaches based on statistical inference of generative models. As a result, we face instead the more well-defined task of selecting between competing generative processes, which can be done under a unified probabilistic framework. Here, we consider the comparison between a variety of generative models including features such as degree correction, where nodes with arbitrary degrees can belong to the same group, and community overlap,…
