A High-Performance External Validity Index for Clustering with a Large Number of Clusters
Mohammad Yasin Karbasian, Ramin Javadi

TL;DR
This paper presents SMBP, a scalable and efficient external validity index for clustering with many clusters, leveraging stable matching to reduce computational complexity while maintaining accuracy.
Contribution
Introduces SMBP, a novel stable matching-based clustering evaluation method that significantly improves computational efficiency for large-scale datasets.
Findings
SMBP achieves comparable accuracy to MWM.
SMBP reduces computational complexity to O(N^2).
SMBP is effective on large and unbalanced datasets.
Abstract
This paper introduces the Stable Matching Based Pairing (SMBP) algorithm, a high-performance external validity index for evaluating clusterings of large-scale datasets with a large number of clusters. SMBP leverages the stable matching framework to pair clusters across different clustering methods, significantly reducing computational complexity to O(N^2), compared to traditional Maximum Weighted Matching (MWM), which has higher complexity. Through comprehensive evaluations on real-world and synthetic datasets, SMBP demonstrates comparable accuracy to MWM and superior computational efficiency. It is particularly effective for balanced, unbalanced, and large-scale datasets with a large number of clusters, making it a scalable and practical solution for modern clustering tasks. Additionally, SMBP is easily implementable within machine learning frameworks like PyTorch and TensorFlow, offering…
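The abstract does not spell out the pairing procedure, so the following is only a minimal sketch of the general idea: pairing clusters from two labelings with Gale-Shapley stable matching, using overlap counts as preferences. The function names (`contingency`, `stable_pairing`, `pairing_score`), the preference rule, and the final score (matched overlap divided by the number of points) are assumptions made for illustration, not the authors' definition of SMBP.

```python
from collections import Counter


def contingency(labels_a, labels_b):
    """Overlap counts: rows are clusters of labels_a, columns clusters of labels_b."""
    ids_a = sorted(set(labels_a))
    ids_b = sorted(set(labels_b))
    counts = Counter(zip(labels_a, labels_b))
    return [[counts[(a, b)] for b in ids_b] for a in ids_a]


def stable_pairing(weights):
    """Gale-Shapley: each row proposes to columns in order of decreasing overlap.

    Assumed preference rule: both sides prefer the partner with larger overlap.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    # Sorting preferences up front adds a log factor in this naive version.
    prefs = [sorted(range(n_cols), key=lambda j: -weights[i][j])
             for i in range(n_rows)]
    next_choice = [0] * n_rows   # index of the next column each row will try
    holder = [None] * n_cols     # row currently held by each column
    free = list(range(n_rows))
    while free:
        i = free.pop()
        if next_choice[i] >= n_cols:
            continue             # row has tried every column; stays unmatched
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        if holder[j] is None:
            holder[j] = i
        elif weights[i][j] > weights[holder[j]][j]:
            free.append(holder[j])   # column trades up; old row becomes free
            holder[j] = i
        else:
            free.append(i)           # rejected; row will try its next choice
    return {holder[j]: j for j in range(n_cols) if holder[j] is not None}


def pairing_score(labels_a, labels_b):
    """Fraction of points that fall in the overlap of matched cluster pairs."""
    w = contingency(labels_a, labels_b)
    pairs = stable_pairing(w)
    return sum(w[i][j] for i, j in pairs.items()) / len(labels_a)


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 2, 2, 2]
    found = [1, 1, 1, 0, 0, 2, 2, 0]
    print(pairing_score(truth, found))
```

In this sketch each row proposes to each column at most once, so the proposal loop performs at most one step per (row, column) pair. A maximum weighted matching solver would instead optimize the total overlap globally, which is the more expensive baseline the abstract compares against.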
Taxonomy
Topics: Advanced Clustering Algorithms Research · Text and Document Classification Technologies
