TL;DR
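This paper introduces streaming algorithms for k-center clustering with z outliers over sliding windows. The algorithms achieve a constant-factor approximation using memory linear in k+z and logarithmic in the window size, and they also yield an estimate of the window's effective diameter, a noise-tolerant measure of data spread.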
Contribution
It presents the first algorithms for k-center clustering with outliers in sliding windows that are both memory-efficient and achieve a constant approximation ratio.
Findings
The algorithms achieve an O(1) approximation ratio.
Memory usage is linear in k+z (the number of centers plus the number of outliers) and logarithmic in the window size.
Experimental results confirm practical viability.
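To make the approximated quantity concrete, here is a minimal sketch of the k-center objective with z outliers: the clustering radius once the z points farthest from all centers are disregarded. It is a brute-force evaluation over a stored window, not the paper's streaming algorithm; the function name `radius_with_outliers`, the Euclidean metric, and the sample points are assumptions made for illustration.

```python
import math

def radius_with_outliers(points, centers, z):
    """Largest point-to-nearest-center distance after discarding
    the z points that lie farthest from every center.

    Hypothetical helper for illustration; assumes Euclidean distance.
    """
    # Distance from each point to its closest center, in ascending order.
    nearest = sorted(min(math.dist(p, c) for c in centers) for p in points)
    # Drop the z largest distances: those points are treated as outliers.
    kept = nearest[:max(0, len(nearest) - z)]
    return kept[-1] if kept else 0.0

# Illustrative window: two tight groups plus one far-away noisy point.
window = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (100.0, 100.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]

r_no_outliers = radius_with_outliers(window, centers, z=0)
r_one_outlier = radius_with_outliers(window, centers, z=1)
print(r_no_outliers, r_one_outlier)
```

With z=0 the single noisy point dominates the radius; allowing one outlier brings the radius down to the spread of the two groups, which is why the outlier-aware variant is less sensitive to noise.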
Abstract
Metric k-center clustering is a fundamental unsupervised learning primitive. Although widely used, this primitive is heavily affected by noise in the data, so that a more sensible variant seeks the best solution that disregards a given number z of points of the dataset, called outliers. We provide efficient algorithms for this important variant in the streaming model under the sliding window setting, where, at each time step, the dataset to be clustered is the window of the most recent data items. Our algorithms achieve O(1) approximation and, remarkably, require a working memory linear in k+z and only logarithmic in the window size. As a by-product, we show how to estimate the effective diameter of the window, which is a measure of the spread of the window points, disregarding a given fraction of noisy distances. We also provide experimental evidence of the practical…
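The effective diameter mentioned in the abstract can be illustrated with a small brute-force sketch. It assumes one common formalization: for a fraction alpha in (0, 1], the smallest threshold such that at least an alpha fraction of pairwise distances fall at or below it. That definition, the name `effective_diameter`, and the Euclidean metric are assumptions here, and the quadratic-time computation is not the paper's low-memory estimator.

```python
import math
from itertools import combinations

def effective_diameter(points, alpha):
    """Smallest distance d such that at least an alpha fraction of all
    pairwise distances are <= d (assumed definition, for illustration)."""
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    if not dists:
        return 0.0
    # Index of the ceil(alpha * n)-th smallest pairwise distance (1-based).
    idx = max(0, math.ceil(alpha * len(dists)) - 1)
    return dists[idx]

# Illustrative one-dimensional window with a single noisy point at 100.
pts = [(0.0,), (1.0,), (2.0,), (100.0,)]

full_diameter = effective_diameter(pts, 1.0)   # plain diameter
half_diameter = effective_diameter(pts, 0.5)   # ignores the largest half of distances
print(full_diameter, half_diameter)
```

Setting alpha to 1 recovers the ordinary diameter, which the noisy point inflates; a smaller alpha disregards the largest distances and reflects the spread of the bulk of the window.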
