Guaranteed Recovery of Unambiguous Clusters

Kayvon Mazooji; Ilan Shomorony

arXiv:2501.13093 · cs.IT · May 9, 2025

TL;DR

This paper proposes an information-theoretic characterization of when a $K$-clustering is ambiguous, and an algorithm that recovers the clustering whenever it is unambiguous. Ambiguity arises most often when clusters differ in density or contain several relatively separated high-density regions.

Contribution

The paper gives a formal, information-theoretic characterization of clustering ambiguity and develops an algorithm that is guaranteed to recover the $K$-clustering whenever it is unambiguous.

Findings

01 The algorithm recovers unambiguous clusters

02 It requires minimal parameter tuning

03 It outperforms existing methods on many datasets

Abstract

Clustering is often a challenging problem because of the inherent ambiguity in what the "correct" clustering should be. Even when the number of clusters $K$ is known, this ambiguity often still exists, particularly when there is variation in density among different clusters, and clusters have multiple relatively separated regions of high density. In this paper we propose an information-theoretic characterization of when a $K$-clustering is ambiguous, and design an algorithm that recovers the clustering whenever it is unambiguous. This characterization formalizes the situation when two high-density regions within a cluster are separable enough that they look more like two distinct clusters than two truly distinct clusters in the $K$-clustering. The algorithm first identifies $K$ partial clusters (or "seeds") using a density-based approach, and then adds unclustered points to the initial…
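
The seed-then-expand idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm (the official repository is the reference for that). The k-nearest-neighbor density estimate, the `min_sep` rule for choosing seeds, and the nearest-labeled-point expansion rule are all assumptions made for illustration, and the abstract's own description of the expansion step is cut off. Every function and parameter name here is hypothetical.

```python
import math


def knn_density(points, k):
    """Crude density estimate: inverse distance to the k-th nearest neighbor."""
    dens = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        # dists[0] is the distance from p to itself, so dists[k] is the k-th neighbor.
        dens.append(1.0 / (dists[k] + 1e-12))
    return dens


def find_seeds(points, K, k, min_sep):
    """Pick K dense, mutually separated points as one-point seeds (assumed rule)."""
    dens = knn_density(points, k)
    order = sorted(range(len(points)), key=lambda i: -dens[i])
    seeds = []
    for i in order:
        if all(math.dist(points[i], points[j]) >= min_sep for j in seeds):
            seeds.append(i)
        if len(seeds) == K:
            break
    if len(seeds) < K:
        raise ValueError("could not find K separated seeds; lower min_sep")
    return seeds


def expand(points, seeds):
    """Grow clusters: repeatedly attach the unlabeled point nearest to any labeled point."""
    labels = {s: c for c, s in enumerate(seeds)}
    unlabeled = set(range(len(points))) - set(seeds)
    while unlabeled:
        _, u, v = min(
            (math.dist(points[u], points[v]), u, v)
            for u in unlabeled
            for v in labels
        )
        labels[u] = labels[v]  # u inherits the cluster of its nearest labeled point
        unlabeled.remove(u)
    return [labels[i] for i in range(len(points))]


def seed_and_expand(points, K, k=2, min_sep=1.0):
    """Two-stage sketch: density-based seeds first, then expansion."""
    seeds = find_seeds(points, K, k, min_sep)
    return expand(points, seeds)
```

Calling `seed_and_expand(points, K=2)` on two well-separated groups of points returns one label per point, with each group sharing a single label. The sketch carries no recovery guarantee of the kind the paper proves.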

Code & Models

Repositories

kmazooji/Minimal-Seed-Expansion (Official)

Taxonomy

Topics: Economic Policies and Impacts · Economic theories and models