Four Algorithms for Correlation Clustering: A Survey

Jafar Jafarov

arXiv:2208.12636 · cs.DS · August 29, 2022



TL;DR

This paper surveys four approximation algorithms for correlation clustering on complete, unweighted graphs, where objects are partitioned according to pairwise "similar"/"dissimilar" labels, and compares their approximation ratios.

Contribution

It provides a comprehensive overview of four key algorithms for correlation clustering, detailing their approximation guarantees and methodological differences.

Findings

01

Of the four surveyed algorithms, the 2.06-approximation offers the best approximation guarantee.

02

The algorithms vary significantly in approximation ratios, from 2.06 to 17429.

03

The survey clarifies the strengths and limitations of each approach.

Abstract

In the Correlation Clustering problem, we are given a set of objects with pairwise similarity information. Our aim is to partition these objects into clusters that match this information as closely as possible. More specifically, the pairwise information is given as a weighted graph $G$ with its edges labelled as "similar" or "dissimilar" by a binary classifier. The goal is to produce a clustering that minimizes the weight of "disagreements": the sum of the weights of similar edges across clusters and dissimilar edges within clusters. In this exposition we focus on the case when $G$ is complete and unweighted. We explore four approximation algorithms for the Correlation Clustering problem under this assumption. In particular, we describe the following algorithms: (i) the $17429$-approximation algorithm by Bansal, Blum, and Chawla, (ii) the $4$-approximation algorithm by Charikar,…
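The disagreement objective described in the abstract can be illustrated with a minimal Python sketch for the complete, unweighted case. This is not code from the survey: the function name `disagreements`, the representation of "similar" edges as a set of unordered pairs, and the toy instance are all assumptions made here for illustration. It only evaluates the cost of a given clustering and does not implement any of the four approximation algorithms.

```python
from itertools import combinations


def disagreements(nodes, similar, cluster_of):
    """Count disagreements of a clustering on a complete, unweighted graph.

    nodes      -- list of hashable objects (hypothetical input format)
    similar    -- set of frozenset({u, v}) pairs labelled "similar";
                  every other pair is treated as "dissimilar"
    cluster_of -- dict mapping each node to a cluster id
    """
    cost = 0
    for u, v in combinations(nodes, 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        is_similar = frozenset((u, v)) in similar
        # A disagreement is a similar pair split across clusters
        # or a dissimilar pair placed in the same cluster.
        if is_similar != same_cluster:
            cost += 1
    return cost


# Toy instance (assumed): a path a-b-c-d of "similar" edges,
# all remaining pairs are "dissimilar".
nodes = ["a", "b", "c", "d"]
similar = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("c", "d")]}

together = {n: 0 for n in nodes}              # one big cluster
split = {"a": 0, "b": 0, "c": 1, "d": 1}      # two clusters of size two

print(disagreements(nodes, similar, together))  # 3
print(disagreements(nodes, similar, split))     # 1
```

On this toy instance, putting everything in one cluster pays for the three dissimilar pairs, while the two-cluster split pays only for the single similar edge between `b` and `c`.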


Taxonomy

Topics: Facility Location and Emergency Management · Advanced Clustering Algorithms Research · Multi-Criteria Decision Making