Bad Communities with High Modularity
Athanasios Kehagias, Leonidas Pitsoulis

TL;DR
This paper critically examines Newman's modularity function and shows that it can overestimate the number of communities in certain graphs, because the null model term favors many balanced clusters.
Contribution
The paper demonstrates that modularity can overestimate community counts and constructs graphs where natural communities are not optimal under modularity maximization.
Findings
Modularity can overestimate the number of communities.
Constructed graphs show natural communities are not always optimal.
Null model term influences community detection outcomes.
Abstract
In this paper we discuss some problematic aspects of Newman's modularity function QN. Given a graph G, the modularity of G can be written as QN = Qf - Q0, where Qf is the intracluster edge fraction of G and Q0 is the expected intracluster edge fraction of the null model, i.e., a randomly connected graph with the same expected degree distribution as G. It follows that the maximization of QN must accommodate two factors pulling in opposite directions: Qf favors a small number of clusters and Q0 favors many balanced (i.e., with approximately equal degrees) clusters. In certain cases the Q0 term can cause overestimation of the true cluster number; this is the opposite of the well-known underestimation effect caused by the "resolution limit" of modularity. We illustrate the overestimation effect by constructing families of graphs with a "natural" community structure which, however, does not…
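The decomposition QN = Qf - Q0 from the abstract can be illustrated with a minimal sketch. It assumes an undirected, unweighted graph without self-loops and uses the standard form of Newman's modularity, in which Q0 is the sum over clusters of (total cluster degree / 2m) squared, with m the number of edges. The function name `modularity_terms`, the variable names, and the toy graph are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def modularity_terms(edges, community_of):
    """Return (Q_f, Q_0, Q_N) for an undirected, unweighted graph.

    edges        : list of (u, v) pairs, each undirected edge listed once
    community_of : dict mapping every node to its cluster label
    (Both names are illustrative assumptions, not the paper's notation.)
    """
    m = len(edges)
    intra_edges = 0
    cluster_degree = defaultdict(int)  # total degree of the nodes in each cluster
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            intra_edges += 1
        cluster_degree[cu] += 1
        cluster_degree[cv] += 1

    q_f = intra_edges / m  # intracluster edge fraction
    q_0 = sum((d / (2 * m)) ** 2 for d in cluster_degree.values())  # null model term
    return q_f, q_0, q_f - q_0


# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "all" for node in range(6)}

qf2, q02, qn2 = modularity_terms(edges, two_clusters)
qf1, q01, qn1 = modularity_terms(edges, one_cluster)
print("two clusters:", qf2, q02, qn2)
print("one cluster: ", qf1, q01, qn1)
```

On this toy graph the single cluster maximizes Qf (every edge is intracluster) but also maximizes Q0, so its QN is zero. The two balanced triangles give up one intracluster edge but halve Q0. This is the trade-off between the two terms that the abstract describes.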
Taxonomy
Topics: Complex Network Analysis Techniques · Topological and Geometric Data Analysis · Graph theory and applications
