Chameleon2++: An Efficient and Scalable Variant Of Chameleon Clustering
Priyanshu Singh, Kapil Ahuja

TL;DR
Chameleon2++ introduces a scalable hierarchical clustering algorithm that reduces complexity to O(n log n) by integrating approximate nearest neighbor search and multi-level graph partitioning, while improving clustering quality on large datasets.
Contribution
It proposes a novel combination of approximate k-NN search and multi-level graph partitioning to significantly enhance scalability and clustering quality in hierarchical clustering.
Findings
Reduces clustering time complexity to O(n log n)
Achieves an average of 4% improvement in clustering quality
Demonstrates scalability on large real-world datasets
Abstract
Hierarchical clustering remains a fundamental challenge in data mining, particularly when dealing with large-scale datasets where traditional approaches fail to scale effectively. Recent Chameleon-based algorithms (Chameleon2, M-Chameleon, and INNGS-Chameleon) have proposed advanced strategies, but they still suffer from high computational complexity, especially on large datasets. With Chameleon2 as the base algorithm, we introduce Chameleon2++, which addresses this challenge. Our algorithm has three parts. First, Graph Generation: we propose an approximate k-NN search instead of an exact one; specifically, we integrate the Annoy algorithm. This results in fast approximate nearest neighbor computation, significantly reducing the graph generation time. Second, Graph Partitioning: we propose the use of a multi-level partitioning algorithm instead of a recursive bisection one.…
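The graph-generation step described above can be illustrated with a small sketch. This is not the authors' code and does not call the Annoy library; it is a pure-Python approximation of the random-projection-tree idea that Annoy is based on, written under several assumptions. The function names (approx_knn_graph, build_leaves), the parameters n_trees and leaf_size, and the Euclidean metric are all hypothetical choices made for illustration.

```python
import math
import random


def _split(points, idxs, rng):
    """Split indices by the hyperplane equidistant from two sampled points."""
    a, b = rng.sample(idxs, 2)
    pa, pb = points[a], points[b]
    normal = [x - y for x, y in zip(pa, pb)]
    mid = [(x + y) / 2 for x, y in zip(pa, pb)]
    offset = sum(n * m for n, m in zip(normal, mid))
    left, right = [], []
    for i in idxs:
        side = sum(n * x for n, x in zip(normal, points[i])) - offset
        (left if side <= 0 else right).append(i)
    return left, right


def build_leaves(points, idxs, leaf_size, rng, leaves):
    """Recursively partition idxs; each final bucket is appended to leaves."""
    if len(idxs) <= max(leaf_size, 1):
        leaves.append(idxs)
        return
    left, right = _split(points, idxs, rng)
    if not left or not right:  # degenerate split, e.g. duplicate points
        leaves.append(idxs)
        return
    build_leaves(points, left, leaf_size, rng, leaves)
    build_leaves(points, right, leaf_size, rng, leaves)


def approx_knn_graph(points, k, n_trees=8, leaf_size=16, seed=0):
    """Return {i: up to k approximate nearest neighbours of point i}.

    Candidates for a point are the points that share a leaf with it in at
    least one random tree; only those candidates are ranked by distance.
    """
    rng = random.Random(seed)
    n = len(points)
    candidates = [set() for _ in range(n)]
    for _ in range(n_trees):
        leaves = []
        build_leaves(points, list(range(n)), leaf_size, rng, leaves)
        for leaf in leaves:
            for i in leaf:
                candidates[i].update(leaf)
    graph = {}
    for i in range(n):
        candidates[i].discard(i)  # no self-loops in the k-NN graph
        ranked = sorted(
            candidates[i],
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        graph[i] = ranked[:k]
    return graph
```

In this sketch, raising n_trees or leaf_size enlarges each point's candidate set, which tends to improve recall at the cost of more distance computations; when leaf_size is at least the number of points, the result coincides with an exact brute-force k-NN graph. A point may receive fewer than k neighbours if its candidate set is small.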
Taxonomy
Topics: Algorithms and Data Compression
Methods: ALIGN
