A Mathematical Theory for Clustering in Metric Spaces
Cheng-Shang Chang, Wanjiun Liao, Yu-Sheng Chen, and Li-Heng Liou

TL;DR
This paper introduces a mathematical framework for clustering in metric spaces built on a cohesion measure. It proposes clustering algorithms with convergence guarantees and establishes a duality between distance and cohesion measures that does not require the positive semi-definiteness assumed by kernel methods.
Contribution
The paper defines a cohesion measure derived from a distance metric, uses it to define what a cluster is, proposes hierarchical and partitional clustering algorithms, and establishes a duality between distance measures and cohesion measures.
Findings
The dual K-sets algorithm converges in the same way as a sequential version of classical kernel K-means.
A duality exists between distance and cohesion measures.
The cohesion measure does not require positive semi-definiteness.
Abstract
Clustering is one of the most fundamental problems in data analysis and it has been studied extensively in the literature. Though many clustering algorithms have been proposed, clustering theories that justify the use of these clustering algorithms are still unsatisfactory. In particular, one of the fundamental challenges is to address the following question: What is a cluster in a set of data points? In this paper, we make an attempt to address such a question by considering a set of data points associated with a distance measure (metric). We first propose a new cohesion measure in terms of the distance measure. Using the cohesion measure, we define a cluster as a set of points that are cohesive to themselves. For such a definition, we show there are various equivalent statements that have intuitive explanations. We then consider the second question: How do we find clusters and…
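To make the idea of a cohesion measure derived from a distance measure concrete, here is a minimal sketch. The summary above does not state the formula, so the code assumes the cohesion between two points is the double-centered negative distance, gamma(x, y) = mean_z d(x, z) + mean_z d(z, y) - mean_{z1,z2} d(z1, z2) - d(x, y). The function names `cohesion_matrix` and `set_cohesion` and the four-point example are hypothetical and are not taken from the paper.

```python
import numpy as np

def cohesion_matrix(d):
    """Assumed cohesion measure: double-centered negative distance matrix."""
    d = np.asarray(d, dtype=float)
    row_mean = d.mean(axis=1, keepdims=True)   # average distance from x to all points
    col_mean = d.mean(axis=0, keepdims=True)   # average distance from all points to y
    grand_mean = d.mean()                      # average distance over all pairs
    return row_mean + col_mean - grand_mean - d

def set_cohesion(gamma, s1, s2):
    """Sum of pairwise cohesion between two index sets."""
    return gamma[np.ix_(np.asarray(s1), np.asarray(s2))].sum()

# Hypothetical data: four points on a line forming two well-separated pairs.
points = np.array([0.0, 1.0, 10.0, 11.0])
d = np.abs(points[:, None] - points[None, :])   # pairwise distances (a metric)
gamma = cohesion_matrix(d)

# Nearby points come out cohesive (positive), distant points incohesive (negative).
print(gamma[0, 1] > 0, gamma[0, 2] < 0)
# Under this reading, a set "cohesive to itself" has non-negative self-cohesion.
print(set_cohesion(gamma, [0, 1], [0, 1]) >= 0)
```

Under this assumed form, every row of `gamma` sums to zero, so a set's cohesion with itself is exactly the negative of its cohesion with the rest of the data. That is one way to read the intuition that a cluster holds together more than it attaches to outside points.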
