Hierarchical clustering by aggregating representatives in sub-minimum-spanning-trees
Wen-Bo Xie, Zhen Liu, Jaideep Srivastava

TL;DR
This paper introduces a hierarchical clustering method that effectively identifies representative points using reciprocal nearest neighbor scoring within sub-minimum-spanning-trees, resulting in improved accuracy and scalability.
Contribution
The paper presents a novel hierarchical clustering algorithm that enhances representative point selection and demonstrates superior accuracy and efficiency over existing methods.
Findings
More accurate clustering results on UCI datasets
O(n log n) time complexity and O(log n) space complexity
Scalable to large datasets with reduced computational resources
Abstract
One of the main challenges for hierarchical clustering is how to appropriately identify the representative points in the lower level of the cluster tree, which are going to be utilized as the roots in the higher level of the cluster tree for further aggregation. However, conventional hierarchical clustering approaches have adopted some simple tricks to select the "representative" points, which might not be representative enough. As a result, the constructed cluster tree is less attractive because of its poor robustness and weak reliability. To address this issue, we propose a novel hierarchical clustering algorithm in which, while building the clustering dendrogram, we can effectively detect the representative point by scoring the reciprocal nearest data points in each sub-minimum-spanning-tree. Extensive experiments on UCI datasets show that the proposed algorithm is more accurate…
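As a rough illustration of the "reciprocal nearest data points" idea mentioned in the abstract, the sketch below finds pairs of points that are each other's nearest neighbor. It is not the authors' algorithm: the abstract does not specify the scoring rule or how the sub-minimum-spanning-trees are built, so neither is attempted here. The function names `nearest_neighbor` and `reciprocal_pairs` and the sample points are hypothetical, and the brute-force search is quadratic, chosen only for readability.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_neighbor(points):
    """Return nn where nn[i] is the index of the closest other point to points[i].

    Brute force, O(n^2); assumes at least two points.
    """
    nn = []
    for i, p in enumerate(points):
        candidates = (j for j in range(len(points)) if j != i)
        nn.append(min(candidates, key=lambda j: dist(p, points[j])))
    return nn


def reciprocal_pairs(points):
    """Return sorted index pairs (i, j), i < j, that are mutual nearest neighbors."""
    nn = nearest_neighbor(points)
    return sorted((i, j) for i, j in enumerate(nn) if i < j and nn[j] == i)


# Hypothetical toy data: two tight pairs and one point in between.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.5, 0.4)]
pairs = reciprocal_pairs(points)
print(pairs)
```

In this toy data the fifth point has a nearest neighbor, but that neighbor's own nearest neighbor is a different point, so the fifth point is not part of any reciprocal pair. Such mutual pairs are a common starting point for merging in agglomerative schemes; how the paper scores them to pick a representative is not shown here.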
Taxonomy
Topics: Advanced Clustering Algorithms Research · Complex Network Analysis Techniques · Data Management and Algorithms
