OneBatchPAM: A Fast and Frugal K-Medoids Algorithm
Antoine de Mathelin, Nicolas Enrique Cecchi, François Deheeger, Mathilde Mougeot, Nicolas Vayatis

TL;DR
This paper introduces OneBatchPAM, a fast and memory-efficient k-medoids algorithm that estimates the clustering objective from a single sampled batch, reducing the number of pairwise dissimilarity computations from O(n^2) to O(mn) while keeping clustering quality close to that of full local search.
Contribution
The paper presents a novel batch-based local-search algorithm for k-medoids that guarantees similar performance to traditional methods with lower computational costs.
Findings
Requires O(mn) pairwise dissimilarity computations and memory, with a batch size of m = O(log n)
Performs comparably to state-of-the-art methods
Reduces running time significantly on large datasets
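To make the O(mn) versus O(n^2) gap concrete, the following minimal sketch counts pairwise dissimilarity evaluations for both settings. The function name `dissimilarity_counts` and the constant `c` in m = c * ln(n) are hypothetical: the source only states m = O(log n) and gives no constant.

```python
import math


def dissimilarity_counts(n, c=10.0):
    """Count pairwise dissimilarity evaluations for a dataset of size n.

    `c` is an assumed constant for the batch size m = c * ln(n);
    the source only states m = O(log n).
    """
    m = math.ceil(c * math.log(n))
    return {
        "m": m,            # batch size
        "full": n * n,     # full n x n dissimilarity matrix
        "batch": m * n,    # n x m matrix against a single batch
    }


counts = dissimilarity_counts(10**6)
print(counts["m"], counts["full"] // counts["batch"])
```

Under this assumed constant, the batch needs only a few hundred points even for a million samples, so the saving grows roughly as n / log n.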
Abstract
This paper proposes a novel k-medoids approximation algorithm to handle large-scale datasets with reasonable computational time and memory complexity. We develop a local-search algorithm that iteratively improves the medoid selection based on an estimate of the k-medoids objective. A single batch of size m << n provides the estimate, which reduces the required memory and the number of pairwise dissimilarity computations to O(mn), instead of the O(n^2) required by most k-medoids baselines. We obtain theoretical results showing that a batch of size m = O(log(n)) is sufficient to guarantee, with high probability, the same performance as the original local-search algorithm. Multiple experiments conducted on real datasets of various sizes and dimensions show that our algorithm provides performance similar to state-of-the-art methods such as FasterPAM and BanditPAM++ with a…
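The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Euclidean dissimilarity, uniform batch sampling, and a naive swap search that re-evaluates the batch estimate for every candidate. The function name `one_batch_kmedoids_sketch` and all parameters are hypothetical. The point it shows is that only an n x m dissimilarity matrix is ever computed, and every swap decision is made from that matrix.

```python
import numpy as np


def one_batch_kmedoids_sketch(X, k, m, max_iter=20, seed=0):
    """Swap-based local search driven by a single batch of m points.

    Illustrative only: Euclidean distance, uniform sampling, and a naive
    swap loop are assumptions, not details taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Draw the batch once and compute the only dissimilarities needed:
    # an n x m matrix instead of the full n x n one.
    batch = rng.choice(n, size=m, replace=False)
    D = np.linalg.norm(X[:, None, :] - X[batch][None, :, :], axis=2)

    medoids = rng.choice(n, size=k, replace=False)

    def estimate(meds):
        # Mean distance from each batch point to its nearest medoid.
        return D[meds].min(axis=0).mean()

    best = estimate(medoids)
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = c
                val = estimate(trial)
                if val < best - 1e-12:
                    medoids, best, improved = trial, val, True
        if not improved:
            break  # no single swap lowers the batch estimate
    return medoids, best
```

A call such as `one_batch_kmedoids_sketch(X, k=3, m=30)` returns the indices of the selected medoids and the batch estimate of the objective. Each sweep here costs O(k^2 n m) because the estimate is recomputed from scratch for every candidate swap; a practical implementation would cache nearest and second-nearest medoid distances, as PAM-style methods commonly do.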
Taxonomy
Topics: Vehicle License Plate Recognition · Metaheuristic Optimization Algorithms Research
