# Clustering under Local Stability: Bridging the Gap between Worst-Case and Beyond Worst-Case Analysis

**Authors:** Maria-Florina Balcan, Colin White

arXiv: 1705.07157 · 2018-12-31

## TL;DR

This paper develops clustering algorithms that keep the worst-case guarantees of existing approximation algorithms while returning near-optimal solutions on stable data. It also introduces local stability, a property of a single optimal cluster rather than of the whole optimal solution, and shows that the algorithms recover every optimal cluster that is locally stable.

## Contribution

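The paper presents natural modifications of existing state-of-the-art approximation algorithms that retain their worst-case guarantees while returning near-optimal solutions on stable data. It also initiates the study of local stability, under which the algorithms output all optimal clusters that satisfy stability locally.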

## Key findings

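- The algorithms inherit the worst-case guarantees of clustering approximation algorithms and return near-optimal solutions when the data is stable.
- The algorithms output every optimal cluster that satisfies stability locally, even when the rest of the data is not stable.
- Strong positive results hold for the $k$-median, $k$-means, and symmetric/asymmetric $k$-center objectives under metric perturbation resilience and robust perturbation resilience.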
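
The stability notions behind these findings are variants of perturbation resilience, which can be stated roughly as follows. This is a paraphrase of the standard Bilu and Linial style definition and of the local variant described in the abstract, not the paper's exact wording; the symbols $S$ (point set), $d$ (distance function), $d'$, $\alpha \ge 1$, and $C_i$ (one optimal cluster) are notation assumed here.

$$
d' \text{ is an } \alpha\text{-perturbation of } d
\iff
d(p,q) \le d'(p,q) \le \alpha \, d(p,q) \quad \text{for all } p, q \in S .
$$

An instance is $\alpha$-perturbation resilient if its optimal clustering stays the same under every $\alpha$-perturbation. The local version asks this of a single cluster only:

$$
C_i \text{ is locally } \alpha\text{-perturbation resilient}
\iff
C_i \text{ belongs to the optimal clustering under every } \alpha\text{-perturbation } d' \text{ of } d .
$$

Metric perturbation resilience restricts the requirement to perturbations $d'$ that are themselves metrics.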

## Abstract

Recently, there has been substantial interest in clustering research that takes a beyond worst-case approach to the analysis of algorithms. The typical idea is to design a clustering algorithm that outputs a near-optimal solution, provided the data satisfy a natural stability notion. For example, Bilu and Linial (2010) and Awasthi et al. (2012) presented algorithms that output near-optimal solutions, assuming the optimal solution is preserved under small perturbations to the input distances. A drawback to this approach is that the algorithms are often explicitly built according to the stability assumption and give no guarantees in the worst case; indeed, several recent algorithms output arbitrarily bad solutions even when just a small section of the data does not satisfy the given stability notion.

In this work, we address this concern in two ways. First, we provide algorithms that inherit the worst-case guarantees of clustering approximation algorithms, while simultaneously guaranteeing near-optimal solutions when the data is stable. Our algorithms are natural modifications to existing state-of-the-art approximation algorithms. Second, we initiate the study of local stability, which is a property of a single optimal cluster rather than an entire optimal solution. We show our algorithms output all optimal clusters which satisfy stability locally. Specifically, we achieve strong positive results in our local framework under recent stability notions including metric perturbation resilience (Angelidakis et al. 2017) and robust perturbation resilience (Balcan and Liang 2012) for the $k$-median, $k$-means, and symmetric/asymmetric $k$-center objectives.
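
To make the idea of local stability concrete, here is a minimal Python sketch that probes a tiny $k$-median instance empirically. It is not the paper's algorithm: it brute-forces the optimum, samples random $\alpha$-perturbations (each distance scaled by a factor in $[1, \alpha]$, without enforcing the triangle inequality), and reports which optimal clusters survive every sampled perturbation. The function names, the toy point set, and the choice $\alpha = 2$ are all assumptions made for illustration.

```python
import itertools
import random


def kmedian_optimum(dist, k):
    """Brute-force k-median: try every set of k centers; return (cost, partition)."""
    n = len(dist)
    best_cost, best_partition = float("inf"), None
    for centers in itertools.combinations(range(n), k):
        clusters = {c: set() for c in centers}
        cost = 0.0
        for p in range(n):
            nearest = min(centers, key=lambda q: dist[p][q])
            clusters[nearest].add(p)
            cost += dist[p][nearest]
        if cost < best_cost:
            best_cost = cost
            best_partition = frozenset(frozenset(m) for m in clusters.values())
    return best_cost, best_partition


def random_perturbation(dist, alpha, rng):
    """Scale each pairwise distance by an independent factor in [1, alpha]."""
    n = len(dist)
    pert = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            pert[i][j] = pert[j][i] = dist[i][j] * rng.uniform(1.0, alpha)
    return pert


def surviving_clusters(dist, k, alpha, trials=200, seed=0):
    """Return the optimal clustering and the clusters kept in every sampled trial."""
    rng = random.Random(seed)
    _, optimum = kmedian_optimum(dist, k)
    survivors = set(optimum)
    for _ in range(trials):
        _, perturbed = kmedian_optimum(random_perturbation(dist, alpha, rng), k)
        survivors &= perturbed
    return optimum, survivors


# Hypothetical toy instance on a line: one tight, isolated group (indices 0-2)
# and a second group (indices 3-6) that must be split into two clusters.
points = [0, 1, 2, 100, 101, 103, 104]
dist = [[float(abs(a - b)) for b in points] for a in points]

optimum, survivors = surviving_clusters(dist, k=3, alpha=2.0)
print("optimal clusters:", sorted(sorted(c) for c in optimum))
print("clusters that survived all sampled perturbations:",
      sorted(sorted(c) for c in survivors))
```

Random sampling can only refute resilience, never certify it: a cluster that drops out of `survivors` is definitely not $\alpha$-perturbation resilient, while one that remains has merely passed the sampled trials. On this toy instance the isolated group `{0, 1, 2}` can be shown by hand to survive every 2-perturbation, whereas the split of the second group can be changed by a suitable perturbation, which is the kind of partially stable input the local framework is meant to handle.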

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.07157/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1705.07157/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1705.07157/full.md

---
Source: https://tomesphere.com/paper/1705.07157