Parallel Hierarchical Agglomerative Clustering in Low Dimensions
MohammadHossein Bateni, Laxman Dhulipala, Willem Fletcher, Kishen N Gowda, D Ellis Hershkowitz, Rajesh Jayaram, Jakub Łącki

TL;DR
This paper develops efficient parallel algorithms for approximate hierarchical clustering with non-monotone linkage functions like centroid and Ward's in low dimensions, and proves hardness results in high dimensions.
Contribution
It introduces NC algorithms for low-dimensional HAC with non-monotone linkages and establishes complexity bounds and hardness results across dimensions.
Findings
NC algorithms for low-dimensional HAC with non-monotone linkages
Hierarchy height is poly(log n) for constant-approximate HAC in low dimensions
HAC with these linkages is CC-hard in arbitrary dimensions
Abstract
Hierarchical Agglomerative Clustering (HAC) is an extensively studied and widely used method for hierarchical clustering, based on repeatedly merging the closest pair of clusters according to an input linkage function. Highly parallel (i.e., NC) algorithms are known for approximate HAC (where near-minimum rather than minimum pairs are merged) for certain linkage functions that monotonically increase as merges are performed. However, no such algorithms are known for many important but non-monotone linkage functions such as centroid and Ward's linkage. In this work, we show that a general class of non-monotone linkage functions -- which includes centroid and Ward's distance -- admits efficient NC algorithms for approximate HAC in low dimensions. Our algorithms are based on a structural result which may be of independent interest: the…
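To make the abstract's terms concrete, here is a minimal sequential sketch of approximate HAC with centroid linkage. It is not the paper's parallel algorithm. The function names and the slack parameter `c` are hypothetical, and reading "near-minimum" as "within a factor `c` of the minimum linkage" is an assumption, since the exact approximation factor is not visible in the text above.

```python
import itertools
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def centroid_linkage(a, b):
    """Euclidean distance between the centroids of clusters a and b."""
    return math.dist(centroid(a), centroid(b))


def approx_hac(points, c=1.0, linkage=centroid_linkage):
    """Illustrative sequential HAC (hypothetical helper, not the paper's method).

    At every step, merge some pair whose linkage value is at most c times
    the current minimum (assumed meaning of "near-minimum"). With c = 1.0
    only minimum pairs qualify, which is exact HAC.
    Returns a list of (linkage_value, merged_cluster) in merge order.
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        pairs = list(itertools.combinations(range(len(clusters)), 2))
        values = {(i, j): linkage(clusters[i], clusters[j]) for i, j in pairs}
        best = min(values.values())
        # Every pair within the slack is a legal merge; taking the last one
        # shows that the chosen pair need not be the exact minimum.
        i, j = [p for p in pairs if values[p] <= c * best][-1]
        merged = clusters[i] + clusters[j]
        merges.append((values[(i, j)], merged))
        clusters = [cl for k, cl in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    triangle = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.8)]
    for value, cluster in approx_hac(triangle):
        print(round(value, 3), cluster)
```

On the three points in the usage example, the first merge happens at distance 2.0 and the second at roughly 1.8. The merge values decrease, which is the non-monotone behaviour of centroid linkage that the abstract refers to.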
