Chamfer-Linkage for Hierarchical Agglomerative Clustering
Kishen N Gowda, Willem Fletcher, MohammadHossein Bateni, Laxman Dhulipala, D Ellis Hershkowitz, Rajesh Jayaram, Jakub Łącki

TL;DR
This paper introduces Chamfer-linkage, a new hierarchical clustering method using Chamfer distance, which consistently outperforms classical linkages in clustering quality across diverse datasets.
Contribution
The paper proposes Chamfer-linkage, a novel linkage function for HAC that is both theoretically efficient and empirically yields higher-quality clusters than traditional methods.
Findings
Chamfer-linkage can be implemented in $O(n^2)$ time.
It consistently produces higher-quality clusters than classical linkages.
Chamfer-linkage broadens the toolkit for hierarchical clustering in practice.
Abstract
Hierarchical Agglomerative Clustering (HAC) is a widely used clustering method based on repeatedly merging the closest pair of clusters, where inter-cluster distances are determined by a linkage function. Unlike many clustering methods, HAC does not optimize a single explicit global objective; clustering quality is therefore primarily evaluated empirically, and the choice of linkage function plays a crucial role in practice. However, popular classical linkages, such as single-linkage, average-linkage, and Ward's method, show high variability across real-world datasets and do not consistently produce high-quality clusterings. In this paper, we propose Chamfer-linkage, a novel linkage function that measures the distance between clusters using the Chamfer distance, a popular notion of distance between point clouds in machine learning and computer vision. We argue that…
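To make the merging loop and the Chamfer distance concrete, here is a minimal Python sketch. It rests on assumptions not taken from the paper: the directed Chamfer distance is written as the sum, over points of one cluster, of the Euclidean distance to the nearest point in the other cluster, and the two directions are simply added to make the linkage symmetric. The paper's exact definition (normalization, symmetrization) may differ. The names `chamfer_directed`, `chamfer_symmetric`, and `naive_hac` are hypothetical, and the loop is a brute-force illustration, not the $O(n^2)$ algorithm mentioned in the findings.

```python
import numpy as np


def chamfer_directed(A, B):
    """Sum, over points in A, of the distance to the nearest point in B."""
    # Pairwise Euclidean distances, shape (len(A), len(B)).
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return float(D.min(axis=1).sum())


def chamfer_symmetric(A, B):
    # Assumption: symmetrize by adding both directions.
    # The paper may symmetrize or normalize differently.
    return chamfer_directed(A, B) + chamfer_directed(B, A)


def naive_hac(points, num_clusters, linkage=chamfer_symmetric):
    """Brute-force HAC: repeatedly merge the closest pair of clusters."""
    clusters = [[i] for i in range(len(points))]  # start from singletons
    merges = []
    while len(clusters) > num_clusters:
        best = None
        # Scan all cluster pairs for the smallest linkage value.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(points[clusters[i]], points[clusters[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((list(clusters[i]), list(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return clusters, merges


# Two well-separated pairs of points should end up in separate clusters.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
clusters, merges = naive_hac(points, num_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

Swapping the `linkage` argument for another function with the same signature is one way to compare this sketch against a classical linkage on the same data.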
Taxonomy
Topics: Advanced Clustering Algorithms Research · Data Quality and Management · Complex Network Analysis Techniques
