A clusterability test for directed graphs
Mario R. Guarracino, Pierre Miasnikof, Alexander Y. Shestopaloff, Houyem Demni, Cristián Bravo, Yuri Lawryshyn

TL;DR
This paper extends a statistical clusterability test to directed graphs without self loops, enabling efficient detection of unclusterable graphs in large networks through sampling, with high accuracy even on small samples.
Contribution
The authors adapt the $\delta$ test to directed graphs and demonstrate its effectiveness and robustness in large network analysis using minimal neighborhood sampling.
Findings
The $\delta$ test accurately detects unclusterable graphs.
The test performs well when as few as 1% of neighborhoods are sampled.
It remains robust despite deviations from original assumptions.
Abstract
In this article, we extend a statistical test of graph clusterability, the $\delta$ test, to directed graphs with no self loops. The $\delta$ test, originally designed for undirected graphs, is based on the premise that graphs with a clustered structure display a mean local density that is statistically higher than the graph's global density. We posit that graphs that do not meet this necessary (but not sufficient) condition for clusterability can be considered unsuited to clustering. In such cases, vertex clusters do not offer a meaningful summary of the broader graph. Additionally, in this study we aim to determine the optimal sample size (number of neighborhoods). Our test, designed for the analysis of large networks, is based on sampling subsets of neighborhoods/nodes. It is designed for cases where computing the density of every node's neighborhood is infeasible. Our results show…
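The premise in the abstract (mean local density compared with global density, estimated from a sample of neighborhoods) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The neighborhood definition (a node plus its in- and out-neighbors), the one-sample t statistic, and the names `global_density`, `local_density`, and `density_gap` are all assumptions made for illustration. The graph is a dict mapping every node to the set of its out-neighbors, with no self loops.

```python
import math
import random
import statistics


def global_density(adj):
    """Edge count divided by n(n-1), the maximum for a directed graph without self loops."""
    n = len(adj)
    m = sum(len(targets) for targets in adj.values())
    return m / (n * (n - 1))


def local_density(adj, node):
    """Density of the subgraph induced by `node` and its in- and out-neighbors.

    This neighborhood definition is an assumption, not taken from the paper.
    """
    hood = set(adj[node]) | {u for u, targets in adj.items() if node in targets}
    hood.add(node)
    k = len(hood)
    if k < 2:
        return 0.0  # an isolated node has no possible edges
    edges = sum(1 for u in hood for v in adj[u] if v in hood)
    return edges / (k * (k - 1))


def density_gap(adj, sample_fraction=0.5, seed=0):
    """Compare the sampled mean local density with the global density.

    Returns (mean_local, global, t), where t is a one-sample t statistic
    (an illustrative choice) or None when the sample has zero variance.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    size = min(len(nodes), max(2, round(sample_fraction * len(nodes))))
    sample = rng.sample(nodes, size)
    local = [local_density(adj, v) for v in sample]
    mean_local = statistics.mean(local)
    spread = statistics.stdev(local)
    kappa = global_density(adj)
    t = (mean_local - kappa) / (spread / math.sqrt(size)) if spread > 0 else None
    return mean_local, kappa, t


# Toy graph: two bidirectional 3-cliques joined by the single edge 2 -> 3.
two_cliques = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {4, 5}, 4: {3, 5}, 5: {3, 4},
}
mean_local, kappa, t = density_gap(two_cliques, sample_fraction=1.0)
print(mean_local, kappa, t)
```

On this toy graph the mean local density exceeds the global density, so the necessary condition for clusterability holds. A graph failing it would be flagged as unsuited to clustering. Lowering `sample_fraction` mimics the sampling of a subset of neighborhoods.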
Taxonomy
Topics: Advanced Clustering Algorithms Research
