A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering
Yevgeny Seldin

TL;DR
This paper introduces a PAC-Bayesian framework for graph clustering, providing theoretical guarantees and an algorithm that balances data fit with information preservation, validated by real-world experiments.
Contribution
It adapts PAC-Bayesian analysis to graph clustering, offering a new theoretical bound and an effective minimization algorithm for practical applications.
Findings
The PAC-Bayesian bound is reasonably tight.
The algorithm performs well on real-life problems.
The approach trades off empirical data fit against the mutual information that the clusters preserve on the graph nodes.
Abstract
We formulate weighted graph clustering as a prediction problem: given a subset of edge weights we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008; Seldin, 2009) to derive a PAC-Bayesian generalization bound for graph clustering. The bound shows that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations was already shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports…
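The trade-off described in the abstract can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the paper's algorithm or its exact bound: it assumes a uniform distribution over nodes, soft assignments q(c|x), a randomized predictor that draws a cluster for each endpoint of an edge, and squared loss. The names `mutual_information`, `expected_loss`, `tradeoff`, and the weight `beta` are hypothetical and chosen for this example only.

```python
import math


def mutual_information(q):
    """I(X;C) in nats, assuming a uniform distribution over nodes.

    q[x][c] is the soft assignment q(c|x); each row sums to 1.
    """
    n = len(q)
    k = len(q[0])
    # Marginal over clusters: q(c) = (1/n) * sum_x q(c|x).
    marginal = [sum(q[x][c] for x in range(n)) / n for c in range(k)]
    mi = 0.0
    for x in range(n):
        for c in range(k):
            if q[x][c] > 0:
                mi += q[x][c] * math.log(q[x][c] / marginal[c]) / n
    return mi


def expected_loss(observed, q, cluster_weight):
    """Average expected squared loss on the observed edge weights.

    observed: dict mapping a node pair (i, j) to its edge weight.
    cluster_weight[a][b]: weight predicted for an edge between clusters a and b.
    The predictor is randomized: it draws a ~ q(.|i) and b ~ q(.|j),
    so the loss is averaged over those draws.
    """
    k = len(cluster_weight)
    total = 0.0
    for (i, j), w in observed.items():
        for a in range(k):
            for b in range(k):
                total += q[i][a] * q[j][b] * (w - cluster_weight[a][b]) ** 2
    return total / len(observed)


def tradeoff(observed, q, cluster_weight, beta):
    """Illustrative objective: beta * data fit + preserved information."""
    return beta * expected_loss(observed, q, cluster_weight) + mutual_information(q)


# Toy graph: nodes 0, 1 form one group and nodes 2, 3 form another.
observed = {(0, 1): 1.0, (2, 3): 1.0, (0, 2): 0.0}
cluster_weight = [[1.0, 0.0], [0.0, 1.0]]
q_hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
q_uniform = [[0.5, 0.5] for _ in range(4)]

for name, q in (("hard", q_hard), ("uniform", q_uniform)):
    print(name, expected_loss(observed, q, cluster_weight), mutual_information(q))
```

On this toy graph the hard clustering fits the observed edges exactly but preserves log 2 nats of information about the nodes, while the uniform assignment preserves none and fits worse. Which one `tradeoff` prefers depends on `beta`, which is the kind of balance the bound is said to formalize.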
Taxonomy
Topics: Complex Network Analysis Techniques · Advanced Clustering Algorithms Research · Advanced Graph Neural Networks
