Sketch-and-solve approaches to k-means clustering by semidefinite programming

Charles Clum; Dustin G. Mixon; Soledad Villar; Kaiying Xie

arXiv:2211.15744·cs.LG·November 30, 2022

TL;DR

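This paper presents a sketch-and-solve method that speeds up the Peng-Wei semidefinite relaxation of k-means clustering. It identifies the optimal clustering when the data is appropriately separated, and otherwise gives a high-confidence lower bound on the optimal k-means value without any assumption on the data or how it is generated.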

Contribution

It introduces a sketch-and-solve approach to the Peng-Wei semidefinite relaxation that yields data-driven, high-confidence lower bounds on the optimal k-means value. These bounds are used to certify the approximate optimality of clusterings obtained by k-means++.

Findings

01

Accelerates the Peng-Wei semidefinite relaxation of k-means clustering via sketch-and-solve.

02

Certifies approximate optimality of k-means solutions.

03

Provides high-confidence lower bounds on k-means objective.

Abstract

We introduce a sketch-and-solve approach to speed up the Peng-Wei semidefinite relaxation of k-means clustering. When the data is appropriately separated we identify the k-means optimal clustering. Otherwise, our approach provides a high-confidence lower bound on the optimal k-means value. This lower bound is data-driven; it does not make any assumption on the data nor how it is generated. We provide code and an extensive set of numerical experiments where we use this approach to certify approximate optimality of clustering solutions obtained by k-means++.
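
To see why a semidefinite relaxation can give a lower bound on the k-means value, it helps to check that every clustering corresponds to a feasible point of the relaxation with the same objective value. The sketch below assumes the commonly stated form of the Peng-Wei program: minimize (1/2) Tr(DX) subject to Tr(X) = k, X1 = 1, X entrywise nonnegative, and X positive semidefinite, where D holds squared pairwise distances. Normalization conventions vary and may differ from the paper's. The function names and the toy data are hypothetical, and this is not the authors' sketch-and-solve procedure; it only illustrates the relaxation itself.

```python
import numpy as np

def kmeans_value(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for t in np.unique(labels):
        cluster = points[labels == t]
        total += np.sum((cluster - cluster.mean(axis=0)) ** 2)
    return total

def partition_matrix(labels):
    """X = sum over clusters C of (1/|C|) * 1_C 1_C^T for the given labeling."""
    n = labels.shape[0]
    X = np.zeros((n, n))
    for t in np.unique(labels):
        indicator = (labels == t).astype(float)
        X += np.outer(indicator, indicator) / indicator.sum()
    return X

def squared_distance_matrix(points):
    """D[i, j] = squared Euclidean distance between points i and j."""
    gram = points @ points.T
    sq_norms = np.diag(gram)
    return sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram

# Toy data (an assumption for illustration, not taken from the paper).
rng = np.random.default_rng(0)
k = 3
points = rng.normal(size=(30, 2))
labels = np.arange(30) % k  # an arbitrary labeling into k clusters

X = partition_matrix(labels)
D = squared_distance_matrix(points)

# Relaxation objective evaluated at the matrix built from the clustering.
sdp_objective = 0.5 * np.trace(D @ X)
```

Because the matrix built from any clustering satisfies all four constraints and reproduces the k-means value, the minimum of the relaxed program can never exceed the optimal k-means value. That is the sense in which the relaxation supplies a lower bound against which a k-means++ solution can be compared.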

Code & Models

Repositories

kkylie/sketch-and-solve_kmeans
Official

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research · Stochastic Gradient Optimization Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings