# Diffusion $K$-means clustering on manifolds: provable exact recovery via semidefinite relaxations

**Authors:** Xiaohui Chen, Yun Yang

arXiv: 1903.04416 · 2020-03-17

## TL;DR

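This paper introduces diffusion $K$-means, a clustering method for data sampled from Riemannian submanifolds. It solves the problem with semidefinite relaxations that provably achieve exact recovery when the submanifolds are sufficiently separated from each other and well connected internally, which makes the method suitable for non-linear, non-Euclidean data.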

## Contribution

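The paper proposes the diffusion $K$-means objective on manifolds and a polynomial-time convex relaxation of it via semidefinite programming (SDP). It adds two adaptive variants: a nuclear norm regularized SDP that adapts to the number of clusters, and a localized diffusion $K$-means that uses a bandwidth estimated from nearest neighbors. Exact recovery guarantees are given for each.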

## Key findings

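- The SDPs achieve exact recovery under suitable between-cluster separability and within-cluster connectedness of the submanifolds
- The convex SDP relaxation solves diffusion $K$-means in polynomial time when the number of clusters is given; a nuclear norm regularized SDP handles an unknown number of clusters
- Localized diffusion $K$-means uses a nearest-neighbor bandwidth, and its exact recovery adapts to the local probability density and geometry of the submanifolds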
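
For orientation, a $K$-means-type SDP relaxation generally takes the form below. This is a sketch of the standard relaxation, not necessarily the paper's exact program. Here $A$ is a placeholder for an $n \times n$ affinity matrix (in this paper it would be derived from the diffusion random walk; the precise definition is in the full text), and $G_1, \dots, G_K$ denote the true clusters.

$$
\hat{Z} = \arg\max_{Z \in \mathbb{R}^{n \times n}} \langle A, Z \rangle
\quad \text{subject to} \quad
Z \succeq 0, \quad Z \geq 0, \quad Z \mathbf{1}_n = \mathbf{1}_n, \quad \operatorname{tr}(Z) = K
$$

$$
Z^{*} = \sum_{k=1}^{K} \frac{1}{|G_k|} \mathbf{1}_{G_k} \mathbf{1}_{G_k}^{\top}
$$

The true membership matrix $Z^{*}$ is feasible: it is positive semidefinite and entrywise non-negative, each row sums to $|G_k| / |G_k| = 1$, and its trace is $K$. Exact recovery means that the SDP solution satisfies $\hat{Z} = Z^{*}$.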

## Abstract

We introduce the *diffusion $K$-means* clustering method on Riemannian submanifolds, which maximizes the within-cluster connectedness based on the diffusion distance. The diffusion $K$-means constructs a random walk on the similarity graph with vertices as data points randomly sampled on the manifolds and edges as similarities given by a kernel that captures the local geometry of manifolds. The diffusion $K$-means is a multi-scale clustering tool that is suitable for data with non-linear and non-Euclidean geometric features in mixed dimensions. Given the number of clusters, we propose a polynomial-time convex relaxation algorithm via the semidefinite programming (SDP) to solve the diffusion $K$-means. In addition, we also propose a nuclear norm regularized SDP that is adaptive to the number of clusters. In both cases, we show that exact recovery of the SDPs for diffusion $K$-means can be achieved under suitable between-cluster separability and within-cluster connectedness of the submanifolds, which together quantify the hardness of the manifold clustering problem. We further propose the *localized diffusion $K$-means* by using the local adaptive bandwidth estimated from the nearest neighbors. We show that exact recovery of the localized diffusion $K$-means is fully adaptive to the local probability density and geometric structures of the underlying submanifolds.
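
The random-walk construction described in the abstract can be illustrated with a small NumPy sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the function name `diffusion_distances`, and the parameters `bandwidth` and `t` are choices made here, and the distance is the standard diffusion distance weighted by the stationary distribution.

```python
import numpy as np

def diffusion_distances(X, bandwidth, t):
    """Pairwise squared diffusion distances after t random-walk steps.

    Illustrative sketch: a Gaussian kernel is assumed for the similarities.
    """
    # Squared Euclidean distances between all pairs of points.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Similarity graph: edge weights from the (assumed) Gaussian kernel.
    W = np.exp(-sq / (2.0 * bandwidth ** 2))
    deg = W.sum(axis=1)
    # Row-stochastic transition matrix of the random walk on the graph.
    P = W / deg[:, None]
    Pt = np.linalg.matrix_power(P, t)
    # Stationary distribution of the walk (proportional to the degrees).
    pi = deg / deg.sum()
    # D_t(i, j)^2 = sum_z (Pt[i, z] - Pt[j, z])^2 / pi[z]
    diff = Pt[:, None, :] - Pt[None, :, :]
    return np.sum(diff ** 2 / pi[None, None, :], axis=-1)

# Two well-separated blobs of 10 points each (synthetic data for illustration).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.2, size=(10, 2)),
    rng.normal(loc=(5.0, 0.0), scale=0.2, size=(10, 2)),
])
D = diffusion_distances(X, bandwidth=0.5, t=8)
within = D[:10, :10]    # distances inside the first blob
between = D[:10, 10:]   # distances across the two blobs
```

In this toy example the walk mixes quickly inside each blob and almost never crosses between them, so within-cluster diffusion distances are small while between-cluster distances stay large. This is the kind of within-cluster connectedness that the diffusion $K$-means objective rewards.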

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.04416/full.md

## Figures

19 figures with captions in the complete paper: https://tomesphere.com/paper/1903.04416/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1903.04416/full.md

---
Source: https://tomesphere.com/paper/1903.04416