Large Scale Correlation Clustering Optimization
Shai Bagon, Meirav Galun

TL;DR
This paper introduces new scalable optimization algorithms for correlation clustering that handle large datasets, providing theoretical insights and enabling applications like face identification and object segmentation.
Contribution
It offers a theoretical analysis of the correlation clustering functional and develops novel algorithms capable of large-scale optimization, outperforming existing methods.
Findings
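The proposed algorithms handle problems with more than 100K variables, which are infeasible for existing methods.
A probabilistic generative interpretation of the functional justifies its intrinsic model-selection capability.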
Applications include face identification and multi-object segmentation.
Abstract
Clustering is a fundamental task in unsupervised learning. The focus of this paper is the Correlation Clustering functional, which combines positive and negative affinities between the data points. The contribution of this paper is twofold: (i) a theoretical analysis of the functional, and (ii) new optimization algorithms that can cope with large-scale problems (>100K variables) that are infeasible using existing methods. Our theoretical analysis provides a probabilistic generative interpretation for the functional and justifies its intrinsic "model-selection" capability. Furthermore, we draw an analogy between optimizing this functional and the well-known Potts energy minimization. This analogy allows us to suggest several new optimization algorithms, which exploit the intrinsic "model-selection" capability of the functional to automatically recover the underlying number of clusters.…
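To make the "model-selection" idea concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not spell out the functional, so the sketch assumes the commonly used form in which the energy of a labeling is minus the sum of affinities between points placed in the same cluster, with a symmetric affinity matrix W and each unordered pair counted once. The names `cc_energy` and `best_labeling` are hypothetical, and the exhaustive search is only workable for a handful of points.

```python
import itertools

import numpy as np


def cc_energy(W, labels):
    """Assumed correlation clustering energy of a labeling.

    Returns minus the sum of affinities W[i, j] over unordered pairs (i, j)
    that share a label. Positive affinities inside a cluster lower the
    energy; negative affinities inside a cluster raise it.
    """
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)  # ignore self-pairs
    return -0.5 * W[same].sum()  # symmetric W: halve to count each pair once


def best_labeling(W):
    """Exhaustive search over all labelings (illustration only, tiny n)."""
    n = W.shape[0]
    best = min(itertools.product(range(n), repeat=n),
               key=lambda labels: cc_energy(W, labels))
    return best, cc_energy(W, best), len(set(best))


# Toy affinities: points 0,1 attract, points 2,3 attract, cross pairs repel.
W = np.array([
    [0.0, 1.0, -1.0, -1.0],
    [1.0, 0.0, -1.0, -1.0],
    [-1.0, -1.0, 0.0, 1.0],
    [-1.0, -1.0, 1.0, 0.0],
])

one_cluster = cc_energy(W, [0, 0, 0, 0])    # pays for all repelling pairs
two_clusters = cc_energy(W, [0, 0, 1, 1])   # keeps only attracting pairs
singletons = cc_energy(W, [0, 1, 2, 3])     # no within-cluster pairs at all

labels, energy, num_clusters = best_labeling(W)
```

Under these assumptions the number of clusters is never supplied: the labeling with two clusters has lower energy than both the single-cluster and the all-singletons labelings, so minimizing the energy alone selects it. This works only because the affinities carry both signs.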
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Advanced Image and Video Retrieval Techniques
