Sparse Partitioning Around Medoids
Lars Lenssen, Erich Schubert

TL;DR
This paper introduces a scalable, sparse, and asymmetric variant of the Partitioning Around Medoids algorithm for large, sparse graph data, together with a method that determines the number of medoids during optimization.
Contribution
It proposes a new scalable approach for sparse and asymmetric medoid clustering, including a method to determine the number of medoids dynamically.
Findings
Exploiting sparsity avoids the quadratic runtime and memory requirements, so the method scales to larger problems.
Effective on graph data such as road networks.
Demonstrated usefulness in electrical engineering applications.
Abstract
Partitioning Around Medoids (PAM, k-Medoids) is a popular clustering technique to use with arbitrary distance functions or similarities, where each cluster is represented by its most central object, called the medoid or the discrete median. In operations research, this family of problems is also known as the facility location problem (FLP). FastPAM recently introduced a speedup for large k to make it applicable to larger problems, but the method still has a runtime quadratic in N. In this chapter, we discuss a sparse and asymmetric variant of this problem, to be used, for example, on graph data such as road networks. By exploiting sparsity, we can avoid the quadratic runtime and memory requirements and make this method scalable to even larger problems, as long as we are able to build a small enough graph of sufficient connectivity to perform local optimization. Furthermore, we consider…
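To make the sparse, asymmetric setting concrete, here is a minimal sketch of the k-medoids objective and a naive swap-based local search over a sparse distance structure. It is not the authors' algorithm and not FastPAM: it recomputes the full cost for every candidate swap, which is only reasonable for toy inputs. The names `total_cost`, `swap_search`, and the dict-of-dicts layout `sparse[candidate][point]` are assumptions made for illustration, as is the choice to treat a missing entry as infinitely far.

```python
import math


def total_cost(sparse, medoids, points):
    """Sum over all points of the distance to the nearest chosen medoid.

    sparse[m] maps a medoid candidate m to {point: distance}. Only stored
    entries are visited, so the work is proportional to the number of
    nonzero entries of the selected medoids. Missing entries count as
    infinitely far (an assumption of this sketch).
    """
    best = {p: math.inf for p in points}
    for m in medoids:
        for p, d in sparse.get(m, {}).items():
            if p in best and d < best[p]:
                best[p] = d
    return sum(best.values())


def swap_search(sparse, medoids, points):
    """Naive local search: swap one medoid for one non-medoid candidate
    whenever that strictly lowers the total cost; stop when no swap helps."""
    medoids = set(medoids)
    cost = total_cost(sparse, medoids, points)
    improved = True
    while improved:
        improved = False
        for m in sorted(medoids):
            for c in sorted(sparse):
                if c in medoids:
                    continue
                trial = (medoids - {m}) | {c}
                trial_cost = total_cost(sparse, trial, points)
                if trial_cost < cost:
                    medoids, cost, improved = trial, trial_cost, True
                    break
            if improved:
                break
    return medoids, cost


# Toy path graph 0-1-2-3-4-5: store only distances of at most two hops.
toy = {i: {j: abs(i - j) for j in range(6) if abs(i - j) <= 2} for i in range(6)}
result_medoids, result_cost = swap_search(toy, {0, 5}, range(6))
```

The candidate set (keys of `sparse`) and the point set are kept separate, so the two need not coincide and `sparse[a][b]` need not equal `sparse[b][a]`. If the stored graph is too sparse for the chosen medoids to reach every point, the cost becomes infinite and the search has nothing to improve on, which illustrates why the abstract asks for a graph of sufficient connectivity.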
