An information-theoretic derivation of min-cut based clustering
Anil Raj, Chris H. Wiggins

TL;DR
This paper derives a theoretical foundation linking min-cut clustering to information theory, specifically the Information Bottleneck method, providing insights into the optimality and approximation of clustering heuristics.
Contribution
It introduces an information-theoretic perspective to min-cut clustering, connecting it to the Information Bottleneck framework and analyzing its effectiveness on different graph types.
Findings
On fast-mixing graphs, the Shi and Malik min-cut cost functions are well approximated by the rate of loss of predictive information about the location of random walkers on the graph.
Optimal information-theoretic and min-cut partitions coincide on graphs with community structure.
The approach generalizes min-cut clustering using information theory principles.
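To make the cost functions in these findings concrete, the sketch below evaluates two cut-based costs for a given partition. It assumes the two Shi and Malik heuristics are the average cut (boundary weight divided by cluster size) and the normalized cut (boundary weight divided by cluster volume, i.e. total degree) in their usual textbook form, which this summary does not spell out. The function name `cut_costs` and the toy graph (two triangles joined by one edge) are hypothetical and chosen only for illustration.

```python
def cut_costs(adj, labels):
    """Return (average_cut, normalized_cut) for a hard partition.

    adj    : symmetric weighted adjacency matrix as a list of lists
    labels : labels[i] is the cluster of node i
    Assumes every cluster is non-empty and has non-zero total degree.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    average_cut = 0.0
    normalized_cut = 0.0
    for h in sorted(set(labels)):
        members = [i for i in range(n) if labels[i] == h]
        # Total weight of edges leaving cluster h.
        boundary = sum(adj[i][j] for i in members
                       for j in range(n) if labels[j] != h)
        volume = sum(degree[i] for i in members)
        average_cut += boundary / len(members)
        normalized_cut += boundary / volume
    return average_cut, normalized_cut


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge 2-3.
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
GOOD = [0, 0, 0, 1, 1, 1]  # follows the two communities
BAD = [0, 1, 0, 1, 0, 1]   # ignores them

print("community partition:", cut_costs(ADJ, GOOD))
print("arbitrary partition:", cut_costs(ADJ, BAD))
```

On this toy graph the partition that follows the two triangles has lower cost under both measures than the arbitrary one, which is the behaviour a min-cut heuristic is meant to reward.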
Abstract
Min-cut clustering, based on minimizing one of two heuristic cost functions proposed by Shi and Malik, has spawned tremendous research, both analytic and algorithmic, in the graph partitioning and image segmentation communities over the last decade. It is, however, unclear whether these heuristics can be derived from a more general principle facilitating generalization to new problem settings. Motivated by an existing graph partitioning framework, we derive relationships between optimizing relevance information, as defined in the Information Bottleneck method, and the regularized cut in a K-partitioned graph. For fast-mixing graphs, we show that the cost functions introduced by Shi and Malik can be well approximated as the rate of loss of predictive information about the location of random walkers on the graph. For graphs generated from a stochastic algorithm designed to model community…
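The notion of predictive information about a random walker's location can be illustrated numerically. The sketch below computes the mutual information between the cluster a walker starts in and the node it occupies after a number of steps. It assumes a discrete-time random walk started from its stationary distribution and a hard cluster assignment; the paper's own diffusion model and regularization are not given in this summary, so this is an illustration of the idea rather than the authors' formulation. The name `relevance_information` and the toy graph are hypothetical.

```python
import math


def relevance_information(adj, labels, steps):
    """Mutual information (in nats) between start cluster z and position y.

    A walker starts at node x drawn from the stationary distribution
    (proportional to degree), takes `steps` random-walk steps, and ends at y.
    Assumes a connected graph, so every degree is non-zero.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    total = sum(degree)
    stationary = [d / total for d in degree]
    clusters = sorted(set(labels))

    # joint[z][y] = p(z, y): mass that started in cluster z and is now at y.
    joint = {}
    for z in clusters:
        v = [stationary[x] if labels[x] == z else 0.0 for x in range(n)]
        for _ in range(steps):
            v = [sum(v[x] * adj[x][y] / degree[x] for x in range(n))
                 for y in range(n)]
        joint[z] = v

    p_y = [sum(joint[z][y] for z in clusters) for y in range(n)]
    info = 0.0
    for z in clusters:
        p_z = sum(joint[z])
        for y in range(n):
            if joint[z][y] > 0.0:
                info += joint[z][y] * math.log(joint[z][y] / (p_z * p_y[y]))
    return info


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge 2-3.
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
GOOD = [0, 0, 0, 1, 1, 1]  # follows the two communities
BAD = [0, 1, 0, 1, 0, 1]   # ignores them

for t in (0, 1, 2, 5):
    print(t, relevance_information(ADJ, GOOD, t),
          relevance_information(ADJ, BAD, t))
```

In this sketch the information decays as the walker mixes, and the partition that follows the two triangles loses it more slowly than the arbitrary one. That is the qualitative link between a small cut and a slow loss of predictive information that the abstract describes.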
Taxonomy
Topics: Complex Network Analysis Techniques · Advanced Clustering Algorithms Research · Caching and Content Delivery
