Network Clustering Approximation Algorithm Using One Pass Black Box Sampling
Thomas DuBois, Jennifer Golbeck, Aravind Srinivasan

TL;DR
This paper presents a fast, memory-efficient network clustering algorithm based on random edge sampling, with an expected cost within a factor of two or three of optimal and applications to large-scale networks and trust inference.
Contribution
The paper introduces a simple one-pass, black-box sampling algorithm for clustering large networks, with provable approximation bounds.
Findings
The algorithm's expected cost is within a factor of two or three of optimal, depending on which of two natural distance functions defines the cost.
The approximation guarantee holds for any problem where there is a probability distribution on clusterings.
Effective in social network trust inference scenarios.
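A short derivation may help show why a guarantee of this kind can hold for any distribution on clusterings. This is a sketch under stated assumptions and not necessarily the paper's own argument: assume the distance $d$ is a metric (symmetric, obeying the triangle inequality), let $X$ and $Y$ be independent clusterings drawn from the same distribution, define the cost of a fixed clustering $C$ as $\mathbb{E}_{Y}[d(C,Y)]$, and let $C^{*}$ be a clustering of minimum cost $\mathrm{OPT}$. The symbols $X$, $Y$, $C^{*}$, and $\mathrm{OPT}$ are notation introduced here for illustration.

```latex
\[
\mathbb{E}_{X,Y}\,[d(X,Y)]
\;\le\; \mathbb{E}_{X}\,[d(X,C^{*})] + \mathbb{E}_{Y}\,[d(C^{*},Y)]
\;=\; 2\,\mathbb{E}_{Y}\,[d(C^{*},Y)]
\;=\; 2\,\mathrm{OPT}.
\]
```

Under these assumptions, outputting a single random sample $X$ has expected cost at most twice the optimum. A distance that satisfies only a weaker form of the triangle inequality would give a larger constant by the same reasoning.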
Abstract
Finding a good clustering of vertices in a network, where vertices in the same cluster are more tightly connected than those in different clusters, is a useful, important, and well-studied task. Many clustering algorithms scale well; however, they are not designed to operate upon internet-scale networks with billions of nodes or more. We study one of the fastest and most memory-efficient algorithms possible: clustering based on the connected components in a random edge-induced subgraph. When defining the cost of a clustering to be its distance from such a random clustering, we show that this surprisingly simple algorithm gives a solution that is within an expected factor of two or three of optimal with either of two natural distance functions. In fact, this approximation guarantee works for any problem where there is a probability distribution on clusterings. We then examine the…
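The core idea in the abstract, clustering by the connected components of a random edge-induced subgraph, can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function name `sample_clustering`, the single uniform `keep_prob` for every edge, and the integer vertex labels `0..num_vertices-1` are all assumptions made for the example. It reads the edge stream once and keeps only a union-find array, so memory grows with the number of vertices rather than the number of edges.

```python
import random


def sample_clustering(num_vertices, edges, keep_prob, seed=None):
    """Cluster vertices by connected components of a randomly sampled edge set.

    Hypothetical sketch: each edge is kept independently with probability
    keep_prob (an assumed, uniform sampling rule). Returns a list where
    entry v is the representative (cluster label) of vertex v.
    """
    rng = random.Random(seed)
    parent = list(range(num_vertices))  # union-find forest, one slot per vertex

    def find(v):
        # Walk to the root, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Single pass over the edges; edges may be any iterable, including a stream.
    for u, v in edges:
        if rng.random() < keep_prob:  # keep this edge in the sampled subgraph
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_u] = root_v  # merge the two components

    return [find(v) for v in range(num_vertices)]


if __name__ == "__main__":
    example_edges = [(0, 1), (1, 2), (3, 4)]
    print(sample_clustering(5, example_edges, keep_prob=0.5, seed=1))
```

With `keep_prob=1.0` the result is simply the connected components of the input graph, and with `keep_prob=0.0` every vertex ends up in its own cluster; intermediate values produce a random clustering between these extremes.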
Taxonomy
Topics: Complex Network Analysis Techniques · Advanced Clustering Algorithms Research · Advanced Graph Neural Networks
