$K$-means with learned metrics

Pablo Groisman; Matthieu Jonckheere; Jordan Serres; Mariela Sued

arXiv:2603.14601 · math.ST · March 20, 2026

TL;DR

This paper develops a unified theoretical framework for the consistency of $k$-means clustering when both the measure and the metric are unknown and estimated, with applications to various metric learning methods.

Contribution

It proves continuity and stability of $k$-means in the measured Gromov-Hausdorff topology, enabling new consistency results for several metric learning estimators.

Findings

1. Established stability of $k$-means with respect to the measured Gromov-Hausdorff topology.

2. Proved consistency for $k$-means based on Isomap, diffusion, and Wasserstein distances.

3. Extended the results to applications such as first passage percolation and discrete length space approximations.
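
For orientation, the objects these findings refer to can be written out as follows. The notation is assumed here and not taken from the paper: $(X, d, \mu)$ is a metric measure space, the cost uses the squared distance (the paper may allow other powers), and $(d_n, \mu_n)$ stands for the estimated distance and measure.

$$
\{c_1^\ast, \dots, c_k^\ast\} \in \operatorname*{arg\,min}_{c_1, \dots, c_k \in X} \int_X \min_{1 \le i \le k} d(x, c_i)^2 \, \mu(dx),
\qquad
V_i = \{\, x \in X : d(x, c_i^\ast) \le d(x, c_j^\ast) \ \text{for all } j \,\}.
$$

In this notation, consistency means that minimizers computed from $(d_n, \mu_n)$ approach the set of minimizers for $(d, \mu)$, and stability of the clusters means that the corresponding Voronoi cells $V_i$ do as well.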

Abstract

We study the Fréchet $k$-means of a metric measure space when both the measure and the distance are unknown and have to be estimated. We prove a general result stating that the $k$-means are continuous with respect to the measured Gromov-Hausdorff topology. In this setting, we also prove a stability result for the Voronoi clusters they determine. We do not assume uniqueness of the set of $k$-means, but when it is unique, the results are stronger. This framework provides a unified approach to proving consistency for a wide range of metric learning procedures. As concrete applications, we obtain new consistency results for several important estimators that were previously unestablished, even when $k = 1$. These include $k$-means based on: (i) Isomap and Fermat geodesic distances on manifolds, (ii) diffusion distances, (iii) Wasserstein distances computed with respect to learned…
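
To make the setting concrete, here is a minimal Python sketch of $k$-means computed from an estimated distance. It is an illustration under stated assumptions, not the authors' procedure: the distance is an Isomap-style shortest-path distance on a symmetrized nearest-neighbor graph, the measure is the empirical measure of the sample, and centers are restricted to sample points and updated with a Lloyd-type (k-medoids) iteration that only finds a local minimizer. The function names `graph_distances` and `sample_k_means` and all parameter values are hypothetical.

```python
import math
import random


def graph_distances(points, n_neighbors):
    """Isomap-style distance estimate: shortest paths in a symmetrized
    nearest-neighbor graph (Floyd-Warshall, fine for small samples).
    Disconnected pairs keep distance math.inf."""
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: euclid[i][j])
        for j in others[:n_neighbors]:
            dist[i][j] = dist[j][i] = euclid[i][j]
    for m in range(n):
        for i in range(n):
            through = dist[i][m]
            if through == math.inf:
                continue
            for j in range(n):
                if through + dist[m][j] < dist[i][j]:
                    dist[i][j] = through + dist[m][j]
    return dist


def assign(dist, centers):
    """Voronoi assignment: index of the nearest center for every sample point."""
    return [min(range(len(centers)), key=lambda c: dist[i][centers[c]])
            for i in range(len(dist))]


def sample_k_means(dist, k, n_iter=50, seed=0):
    """Lloyd-type iteration on a distance matrix, with centers restricted to
    sample points (an assumption of this sketch). Returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(range(len(dist)), k)
    for _ in range(n_iter):
        labels = assign(dist, centers)
        new_centers = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old center if a cell is empty
                new_centers.append(centers[c])
                continue
            # Within-cluster minimizer of the sum of squared distances.
            best = min(members, key=lambda m: sum(dist[i][m] ** 2 for i in members))
            new_centers.append(best)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(dist, centers)


# Usage: 40 points on three quarters of the unit circle (a curved 1-D manifold).
n = 40
points = [(math.cos(1.5 * math.pi * i / (n - 1)), math.sin(1.5 * math.pi * i / (n - 1)))
          for i in range(n)]
d_hat = graph_distances(points, n_neighbors=2)
centers, labels = sample_k_means(d_hat, k=2)
print("centers:", centers)
print("labels:", labels)
```

With the learned distance, the two clusters are contiguous pieces of the arc, and the estimated distance between the two endpoints is close to the arc length rather than the much shorter straight-line chord. Swapping `graph_distances` for another estimator (for example a diffusion distance) leaves `sample_k_means` unchanged, which is the kind of modularity the abstract's unified framework suggests.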

Taxonomy

Topics: Topological and Geometric Data Analysis · Morphological variations and asymmetry · Stochastic Gradient Optimization Techniques