CAS Condensed and Accelerated Silhouette: An Efficient Method for Determining the Optimal K in K-Means Clustering
Krishnendu Das, Sumit Gupta, Awadhesh Kumar

TL;DR
This paper introduces an efficient method combining the Condensed Silhouette with statistical techniques to determine the optimal number of clusters in K-Means, significantly reducing computation time while maintaining accuracy.
Contribution
It proposes a novel accelerated silhouette-based approach with statistical enhancements for optimal K selection, improving speed and scalability in high-dimensional clustering tasks.
Findings
Achieves up to 99% faster execution on high-dimensional data
Maintains clustering precision comparable to traditional methods
Suitable for real-time and resource-constrained environments
Abstract
Clustering is a critical component of decision-making in today's data-driven environments. It has been widely used in a variety of fields such as bioinformatics, social network analysis, and image processing. However, clustering accuracy remains a major challenge in large datasets. This paper presents a comprehensive overview of strategies for selecting the optimal value of k in clustering, with a focus on achieving a balance between clustering precision and computational efficiency in complex data environments. In addition, this paper introduces improvements to clustering techniques for text and image data to provide insights into better computational performance and cluster validity. The proposed approach is based on the Condensed Silhouette method, along with statistical methods such as Local Structures, Gap Statistics, Class Consistency Ratio, and a Cluster Overlap Index CCR and…
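For orientation, the conventional silhouette-based selection of k that the abstract takes as its starting point can be sketched as below. This is only a baseline illustration and not the paper's Condensed Silhouette or CAS procedure, whose details are not given here. The function names (`farthest_point_init`, `kmeans`, `mean_silhouette`, `choose_k`), the deterministic farthest-point initialization, and the toy data are all assumptions made for the example.

```python
import math

def farthest_point_init(points, k):
    # Assumed deterministic seeding: start at the first point, then keep
    # adding the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    # Plain Lloyd iterations; returns one cluster label per point.
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centers[c])  # keep the old center if a cluster empties
        if updated == centers:
            break
        centers = updated
    return labels

def mean_silhouette(points, labels):
    # Exact silhouette: s = (b - a) / max(a, b), averaged over all points.
    # This is the O(n^2) computation that accelerated variants try to avoid.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

def choose_k(points, candidates):
    # Score every candidate k and return the one with the highest mean silhouette.
    scores = {k: mean_silhouette(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical): three small, well-separated 3x3 grids of points.
offsets = [-0.5, 0.0, 0.5]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx in offsets for dy in offsets]
best_k, scores = choose_k(points, range(2, 7))
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

Because every candidate k requires all pairwise distances, the cost of this baseline grows quadratically with the number of points, which is the overhead a condensed or accelerated silhouette aims to reduce.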
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Customer churn and segmentation
