TL;DR
This paper introduces SOS-SDP, an exact branch-and-bound algorithm for minimum sum-of-squares clustering (MSSC). It combines a cutting-plane SDP relaxation with instance-level must-link and cannot-link constraints, and it solves real-world instances with up to 4000 data points.
Contribution
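The paper presents an exact MSSC solver that combines lower bounds from a cutting-plane SDP relaxation, upper bounds from constrained k-means, and must-link and cannot-link constraints within branch-and-bound, for instances with up to 4000 data points.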
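For orientation, the Peng-Wei SDP relaxation named in the abstract is usually written as below. The notation is an assumption of this summary, not taken from the page: n is the number of data points, k the number of clusters, W the n-by-n Gram matrix of the data (W = X X^T), e the all-ones vector, and Z the matrix variable.

```latex
\begin{aligned}
\min_{Z \in \mathbb{S}^n} \quad & \operatorname{tr}\bigl(W (I - Z)\bigr) \\
\text{s.t.} \quad & Z e = e, \\
& \operatorname{tr}(Z) = k, \\
& Z \ge 0 \ \text{(elementwise)}, \quad Z \succeq 0 .
\end{aligned}
```

Any feasible Z gives a lower bound on the MSSC optimum; the cutting-plane procedure tightens that bound by adding valid inequalities on the entries of Z.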
Findings
Solves real-world MSSC instances with up to 4000 data points to optimality.
Integrates must-link and cannot-link constraints into the SDP-based branch-and-bound.
Reduces the problem size at each level of the branch-and-bound tree while preserving the structure of the SDP.
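One way to see why must-link constraints can shrink the problem: points that must share a cluster can be collapsed into a single weighted point at their mean, at the price of a constant offset in the objective. The sketch below illustrates that idea only; it is not the paper's reduction, and the function name `merge_must_links` and the sample data are hypothetical.

```python
def merge_must_links(points, must_link):
    """Collapse must-linked points into weighted super-points.

    Returns (merged, offset): merged is a list of (mean, weight) pairs and
    offset is the within-group sum of squares fixed by the merge.
    Illustrative sketch only, not the reduction used in SOS-SDP.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in must_link:
        parent[find(i)] = find(j)

    # Group point indices by their union-find root.
    groups = {}
    for idx in range(len(points)):
        groups.setdefault(find(idx), []).append(idx)

    dim = len(points[0])
    merged, offset = [], 0.0
    for members in groups.values():
        mean = [sum(points[i][d] for i in members) / len(members)
                for d in range(dim)]
        # Squared distances to the group mean do not depend on the clustering.
        offset += sum((points[i][d] - mean[d]) ** 2
                      for i in members for d in range(dim))
        merged.append((mean, len(members)))
    return merged, offset


# Hypothetical data: five points, two must-link pairs.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0), (5.0, 5.0)]
merged, offset = merge_must_links(points, [(0, 1), (2, 3)])
print(len(points), "points reduced to", len(merged), "super-points")
```

Clustering the weighted super-points and adding `offset` gives the same objective value as clustering the original points under the must-link constraints, so each must-link decision leaves a smaller problem to bound.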
Abstract
The minimum sum-of-squares clustering problem (MSSC) consists of partitioning observations into clusters in order to minimize the sum of squared distances from the points to the centroid of their cluster. In this paper, we propose an exact algorithm for the MSSC problem based on the branch-and-bound technique. The lower bound is computed by using a cutting-plane procedure where valid inequalities are iteratively added to the Peng-Wei SDP relaxation. The upper bound is computed with the constrained version of k-means where the initial centroids are extracted from the solution of the SDP relaxation. In the branch-and-bound procedure, we incorporate instance-level must-link and cannot-link constraints to express knowledge about which data points should or should not be grouped together. We manage to reduce the size of the problem at each level preserving the structure of the SDP…
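The link between the clustering objective and the SDP can be checked numerically. For a fixed partition, define Z with Z_ij = 1/|C| when points i and j lie in the same cluster C and 0 otherwise; then tr(W(I - Z)), with W the Gram matrix of the data, equals the sum of squared distances to the centroids. The following minimal sketch verifies this on a toy instance. The function names and data are hypothetical, and it assumes every cluster is non-empty.

```python
def mssc_cost(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    dim = len(points[0])
    cost = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(p[d] for p in members) / len(members)
                    for d in range(dim)]
        cost += sum((p[d] - centroid[d]) ** 2
                    for p in members for d in range(dim))
    return cost


def sdp_objective(points, labels, k):
    """Evaluate tr(W (I - Z)) for the Z matrix induced by a partition."""
    n = len(points)
    sizes = [labels.count(c) for c in range(k)]
    # Z_ij = 1/|C| if i and j share cluster C, else 0.
    Z = [[1.0 / sizes[labels[i]] if labels[i] == labels[j] else 0.0
          for j in range(n)] for i in range(n)]
    # Gram matrix W_ij = <x_i, x_j>.
    W = [[sum(a * b for a, b in zip(points[i], points[j]))
          for j in range(n)] for i in range(n)]
    trace_w = sum(W[i][i] for i in range(n))
    # Z is symmetric, so tr(W Z) is the sum of elementwise products.
    trace_wz = sum(W[i][j] * Z[i][j] for i in range(n) for j in range(n))
    return trace_w - trace_wz


# Hypothetical toy instance: four points, two clusters.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 1, 1]
direct = mssc_cost(points, labels, 2)
via_z = sdp_objective(points, labels, 2)
print(direct, via_z)
```

Because every partition yields a feasible Z for the relaxation, minimizing over the relaxed feasible set can only lower the value, which is why the SDP provides a valid lower bound for branch-and-bound.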
