Average Sensitivity of Hierarchical $k$-Median Clustering
Shijie Li, Weiqiang He, Ruobing Bai, Pan Peng

TL;DR
This paper analyzes the average sensitivity of hierarchical k-median clustering algorithms, proposing a new efficient method with low sensitivity and high quality, while demonstrating the instability of some existing methods.
Contribution
It introduces an efficient hierarchical k-median clustering algorithm with proven low average sensitivity and robustness, addressing stability issues in existing methods.
Findings
Proposed algorithm has low average sensitivity and high clustering quality.
Single linkage and a variant of CLNSS show high sensitivity and instability.
Experimental results confirm robustness and effectiveness of the new algorithm.
Abstract
Hierarchical clustering is a widely used method for unsupervised learning with numerous applications. However, in the application of modern algorithms, the datasets studied are usually large and dynamic. If the hierarchical clustering is sensitive to small perturbations of the dataset, the usability of the algorithm will be greatly reduced. In this paper, we focus on the hierarchical $k$-median clustering problem, which bridges hierarchical and centroid-based clustering while offering theoretical appeal, practical utility, and improved interpretability. We analyze the average sensitivity of algorithms for this problem by measuring the expected change in the output when a random data point is deleted. We propose an efficient algorithm for hierarchical $k$-median clustering and theoretically prove its low average sensitivity and high clustering quality. Additionally, we show that single…
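The notion of average sensitivity described in the abstract (the expected change in the output when a random data point is deleted) can be illustrated with a small sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the helper `average_sensitivity`, the toy deterministic algorithm `one_median` (a single median on a line, standing in for a clustering algorithm), and the use of the symmetric-difference size between output center sets as the measure of change. For a randomized algorithm, the expectation would also be taken over its internal randomness.

```python
from statistics import median_low
from typing import Callable, FrozenSet, List

Centers = FrozenSet[float]


def average_sensitivity(
    algorithm: Callable[[List[float]], Centers],
    points: List[float],
) -> float:
    """Mean change in output over all single-point deletions.

    The change is measured as the size of the symmetric difference between
    the output on the full dataset and the output on the dataset with one
    point removed (an illustrative choice, not the paper's definition).
    """
    full_output = algorithm(points)
    total_change = 0
    for i in range(len(points)):
        reduced = points[:i] + points[i + 1:]  # delete the i-th point
        total_change += len(full_output ^ algorithm(reduced))
    # Each point is deleted with equal probability, so average uniformly.
    return total_change / len(points)


def one_median(points: List[float]) -> Centers:
    """Toy stand-in algorithm: a 1-median of points on a line is a median."""
    return frozenset({median_low(points)})


points = [1.0, 2.0, 3.0, 4.0, 5.0]
print(average_sensitivity(one_median, points))
```

In this toy setting the output changes only when the deletion shifts the median, so an algorithm whose output rarely moves under single deletions receives a small score.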
Taxonomy
Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition · Data Management and Algorithms
