DynHAC: Fully Dynamic Approximate Hierarchical Agglomerative Clustering
Shangdi Yu, Laxman Dhulipala, Jakub Łącki, Nikos Parotsidis

TL;DR
DynHAC is a novel algorithm that efficiently maintains an approximate hierarchical clustering in dynamic graphs, significantly reducing update time while improving clustering accuracy over existing methods.
Contribution
The paper introduces the first fully dynamic HAC algorithm for average linkage that maintains a (1+ε)-approximation with provable guarantees.
Findings
Handles updates up to 423x faster than recomputation.
Achieves up to 0.21 higher NMI score than existing dynamic HAC algorithms.
Provides a scalable solution for real-world dynamic graph clustering.
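For readers unfamiliar with the NMI (normalized mutual information) score quoted above, the following is a minimal sketch of how it can be computed for two flat clusterings of the same items. The function name `nmi` is hypothetical, and normalizing by the arithmetic mean of the two entropies is an assumption; the summary does not say which normalization or evaluation protocol the paper uses.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information between two flat clusterings.

    Assumption: normalizes by the arithmetic mean of the two entropies.
    Other normalizations (min, max, geometric mean) are also in common use.
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum((c / n) * log(c / n) for c in counts.values())

    # Mutual information from the joint and marginal label counts.
    mi = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    denom = (entropy(count_a) + entropy(count_b)) / 2
    # Two trivial one-cluster partitions are treated as identical.
    return 1.0 if denom == 0 else mi / denom
```

A score of 1 means the two clusterings agree up to a renaming of labels, and a score of 0 means they are statistically independent, so a gain of 0.21 is a large difference on this scale.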
Abstract
We consider the problem of maintaining a hierarchical agglomerative clustering (HAC) in the dynamic setting, when the input is subject to point insertions and deletions. We introduce DynHAC, the first dynamic HAC algorithm for the popular average-linkage version of the problem which can maintain a (1+ε)-approximate solution. Our approach leverages recent structural results on (1+ε)-approximate HAC to carefully identify the part of the clustering dendrogram that needs to be updated in order to produce a solution that is consistent with what a full recomputation from scratch would have output. We evaluate DynHAC on a number of real-world graphs. We show that DynHAC can handle each update up to 423x faster than what it would take to recompute the clustering from scratch. At the same time it achieves up to 0.21 higher NMI score than the state-of-the-art dynamic hierarchical…
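To make the notion of a (1+ε)-approximate average-linkage clustering concrete, here is a naive static sketch in Python. It is not DynHAC and does not handle insertions or deletions; it only illustrates the from-scratch computation that a dynamic algorithm aims to stay consistent with. It assumes the usual graph formulation, in which the similarity of two clusters is the total edge weight between them divided by the product of their sizes, and a merge is valid when its similarity is at least the current maximum divided by (1+ε). The function name and input format are hypothetical.

```python
def approx_average_linkage_hac(n, edges, eps=0.1):
    """Naive static (1+eps)-approximate average-linkage HAC on a graph.

    n:     number of vertices, labeled 0..n-1
    edges: dict mapping (u, v) to a positive similarity weight
    Returns a list of merges (cluster_a, cluster_b, similarity).
    """
    clusters = {i: frozenset([i]) for i in range(n)}
    # cut[(a, b)] = total edge weight between clusters a and b (a < b).
    cut = {}
    for (u, v), w in edges.items():
        if u != v:
            key = (min(u, v), max(u, v))
            cut[key] = cut.get(key, 0.0) + w

    def sim(key):
        a, b = key
        return cut[key] / (len(clusters[a]) * len(clusters[b]))

    merges = []
    next_id = n
    while cut:
        best = max(sim(k) for k in cut)
        # Any pair within a (1+eps) factor of the best is a valid merge;
        # take the first such pair in sorted order.
        a, b = next(k for k in sorted(cut) if sim(k) >= best / (1 + eps))
        merges.append((clusters[a], clusters[b], sim((a, b))))

        # Redirect the edges of a and b to the new cluster id.
        new_cut = {}
        for (x, y), w in cut.items():
            if (x, y) == (a, b):
                continue
            x2 = next_id if x in (a, b) else x
            y2 = next_id if y in (a, b) else y
            key = (min(x2, y2), max(x2, y2))
            new_cut[key] = new_cut.get(key, 0.0) + w
        clusters[next_id] = clusters.pop(a) | clusters.pop(b)
        cut = new_cut
        next_id += 1
    return merges
```

With `eps=0` the sketch reduces to exact average linkage. With `eps > 0` it may merge a pair that is slightly worse than the best one; roughly, this slack is what the structural results on approximate HAC exploit, and this sketch makes no attempt to reproduce the dynamic update logic itself.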
Taxonomy
Topics: Advanced Clustering Algorithms Research · Data Management and Algorithms · Text and Document Classification Technologies
