Optimal Parallel Algorithms for Dendrogram Computation and Single-Linkage Clustering
Laxman Dhulipala, Xiaojun Dong, Kishen N Gowda, Yan Gu

TL;DR
This paper introduces parallel algorithms for computing Single-Linkage Dendrograms (SLDs) that achieve up to 150x speedup over Union-Find algorithms and scale to billion-scale trees.
Contribution
The paper presents parallel algorithms for SLD computation, including a deterministic, output-sensitive algorithm based on parallel tree contraction, that come with provable work and depth bounds and are also fast in practice.
Findings
Achieved up to 150x speedup over Union-Find algorithms.
Developed algorithms with O(n log h) work and O(log^2 n log^2 h) depth, where h is the height of the output SLD.
Enabled fast processing of billion-scale trees.
Abstract
Computing a Single-Linkage Dendrogram (SLD) is a key step in the classic single-linkage hierarchical clustering algorithm. Given an input edge-weighted tree T, the SLD of T is a binary dendrogram that summarizes the n-1 clusterings obtained by contracting the edges of T in order of weight. Existing algorithms for computing the SLD all require Ω(n log n) work where n = |T|. Furthermore, to the best of our knowledge no prior work provides a parallel algorithm obtaining non-trivial speedup for this problem. In this paper, we design faster parallel algorithms for computing SLDs both in theory and in practice based on new structural results about SLDs. In particular, we obtain a deterministic output-sensitive parallel algorithm based on parallel tree contraction that requires O(n log h) work and O(log^2 n log^2 h) depth, where h is the height of the output SLD. We…
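To make the definition concrete, here is a minimal sequential sketch of the Union-Find style baseline that the speedups are measured against. It is not the paper's parallel algorithm. The function name `sld_union_find`, the `(u, v, weight)` edge format, and the numbering of dendrogram nodes (leaves `0..n-1`, then one internal node per edge in sorted order) are assumptions made for illustration.

```python
def sld_union_find(n, edges):
    """Sequential SLD sketch for a tree on vertices 0..n-1.

    edges: list of (u, v, weight) tuples forming a tree (n - 1 edges).
    Returns a list `dend_parent` of length 2n - 1 where dend_parent[x] is the
    parent of dendrogram node x, or None for the root. Leaves are 0..n-1;
    internal node n + i is created by the i-th lightest edge.
    """
    uf = list(range(n))    # union-find parent pointers over tree vertices
    top = list(range(n))   # top[r]: dendrogram node for the cluster rooted at r
    dend_parent = [None] * (2 * n - 1)

    def find(x):
        # Find the cluster representative, with path halving.
        while uf[x] != x:
            uf[x] = uf[uf[x]]
            x = uf[x]
        return x

    # Sorting the edges is the n log n step of this baseline.
    for i, (u, v, _w) in enumerate(sorted(edges, key=lambda e: e[2])):
        ru, rv = find(u), find(v)
        assert ru != rv, "input must be a tree (no cycles)"
        node = n + i
        # Contracting the edge merges two clusters under a new dendrogram node.
        dend_parent[top[ru]] = node
        dend_parent[top[rv]] = node
        uf[ru] = rv
        top[rv] = node
    return dend_parent


# Path 0 - 1 - 2 - 3 with weights 3.0, 1.0, 2.0.
print(sld_union_find(4, [(0, 1, 3.0), (1, 2, 1.0), (2, 3, 2.0)]))
```

On this path the dendrogram is a chain of height n - 1, the regime in which an output-sensitive O(n log h) bound gives no improvement over sorting; a balanced dendrogram has much smaller h.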
Taxonomy
Topics: Graph Theory and Algorithms
