Active Distance-Based Clustering using K-medoids
Mehrdad Ghadiri, Amin Aghaee, Mahdieh Soleymani Baghshah

TL;DR
This paper introduces an active distance-based k-medoids clustering method that queries only a small subset of the pairwise distances and estimates the remaining ones, so that clustering needs far fewer distance computations.
Contribution
The paper presents an active clustering algorithm that avoids the need for a complete distance matrix: it actively queries a small set of distances and uses the triangle inequality to compute upper bounds on the unknown ones.
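To make the triangle-inequality idea concrete, here is a minimal sketch, not the authors' algorithm. It assumes the queried distances are given as a dictionary of index pairs (the name `upper_bound_distances` and the input layout are hypothetical). For any unknown pair, the tightest upper bound implied by the triangle inequality is the shortest path through known distances, computed here with Floyd-Warshall. The paper's recursive scheme and its active choice of points are not reproduced.

```python
import math


def upper_bound_distances(n, known):
    """Return an n x n matrix of upper bounds on pairwise distances.

    known: dict mapping (i, j) index pairs to queried distances
           (hypothetical input format, assumed symmetric).
    Pairs with no connecting chain of known distances stay at infinity.
    """
    ub = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (i, j), d in known.items():
        ub[i][j] = min(ub[i][j], d)
        ub[j][i] = min(ub[j][i], d)

    # Floyd-Warshall: d(i, j) <= d(i, m) + d(m, j) for every intermediate m.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = ub[i][m] + ub[m][j]
                if via < ub[i][j]:
                    ub[i][j] = via
    return ub


# Only three of the six pairwise distances among four points are queried.
known = {(0, 1): 2.0, (1, 2): 3.0, (2, 3): 1.5}
ub = upper_bound_distances(4, known)
print(ub[0][2])  # 5.0, bounded by d(0,1) + d(1,2)
print(ub[0][3])  # 6.5, bounded by d(0,1) + d(1,2) + d(2,3)
```

The cubic cost of Floyd-Warshall is acceptable only for small illustrations; it is used here because it states the bound directly.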
Findings
Effective clustering with limited distance data
Reduces computational cost in large datasets
Performs well on real-world and synthetic data
Abstract
The k-medoids algorithm is a partitional, centroid-based clustering algorithm that uses pairwise distances between data points and tries to directly decompose a dataset of n points into a set of k disjoint clusters. However, k-medoids requires all pairwise distances between data points, which are not easy to obtain in many applications. In this paper, we introduce a new method that requires only a small proportion of the full set of distances and estimates an upper bound for each unknown distance from the queried ones, using the triangle inequality. Our method is built upon a recursive approach that clusters the objects, actively chooses some points from each group of data, and acquires the distances between these prominent points from an oracle. Experimental results show that the proposed…
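For reference, the baseline that the abstract starts from can be sketched as follows. This is a simple alternating (assign, then update) k-medoids on a full distance matrix, not the PAM swap procedure and not the paper's active method; the function name `k_medoids`, the `init` argument, and the toy data are assumptions for illustration. Note that every step reads entries of `dist`, which is why the plain algorithm needs all pairwise distances.

```python
def k_medoids(dist, k, init, max_iter=100):
    """Alternating k-medoids on a full distance matrix.

    dist: n x n symmetric matrix (list of lists) of pairwise distances.
    init: list of k point indices used as the initial medoids.
    Returns (medoids, labels), where labels[i] is the cluster of point i.
    """
    n = len(dist)
    medoids = list(init)

    def assign(current):
        # Each point joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][current[c]])
                for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster becomes empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimizes total distance to its cluster members.
            best = min(members, key=lambda m: sum(dist[m][j] for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids

    return medoids, assign(medoids)


# Toy data: six points on a line, forming two well-separated groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
medoids, labels = k_medoids(dist, k=2, init=[0, 1])
print(medoids)  # [1, 4]
print(labels)   # [0, 0, 0, 1, 1, 1]
```

In a setting like the one the abstract describes, unknown entries of `dist` would have to be replaced by estimates such as triangle-inequality upper bounds; how the paper combines the two is not shown here.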
Taxonomy
Topics: Algorithms and Data Compression · Machine Learning and Algorithms · Advanced Clustering Algorithms Research
