Near-Optimal $k$-Clustering in the Sliding Window Model

David P. Woodruff; Peilin Zhong; Samson Zhou

arXiv:2311.00642 · cs.DS · November 2, 2023 · 1 citation

TL;DR

This paper introduces a near-optimal algorithm for $(k,z)$-clustering in the sliding window model, achieving a $(1+ε)$-approximation with significantly improved space complexity, and develops an online coreset data structure with theoretical bounds.

Contribution

It presents the first near-optimal $(1+ε)$-approximation algorithm for $(k,z)$-clustering in the sliding window model and introduces an online coreset data structure with proven complexity bounds.

Findings

01

Achieves near-optimal space complexity for clustering in the sliding window model.

02

Develops an online coreset with provable size bounds.

03

Shows that online coreset construction is strictly harder than offline coreset construction.

Abstract

Clustering is an important technique for identifying structural information in large-scale data analysis, where the underlying dataset may be too large to store. In many applications, recent data can provide more accurate information and thus older data past a certain time is expired. The sliding window model captures these desired properties and thus there has been substantial interest in clustering in the sliding window model. In this paper, we give the first algorithm that achieves near-optimal $(1 + ε)$-approximation to $(k, z)$-clustering in the sliding window model, where $z$ is the exponent of the distance function in the cost. Our algorithm uses $\frac{k}{\min(ε^{4}, ε^{2+z})} \, \mathrm{polylog} \, \frac{nΔ}{ε}$ words of space when the points are from $[Δ]^{d}$, thus significantly improving on works by Braverman et al. (SODA 2016), Borassi…
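To make the objective in the abstract concrete, the following is a minimal Python sketch of the standard $(k, z)$-clustering cost, evaluated over the most recent points of a stream. It assumes Euclidean distance and a fixed set of candidate centers. The names `kz_cost`, `sliding_window_costs`, and `window` are hypothetical and chosen for illustration. This naive version stores the entire window, so it is a baseline that shows what is being approximated, not the paper's space-efficient algorithm.

```python
from collections import deque
from math import dist


def kz_cost(points, centers, z):
    """Sum over points of (Euclidean distance to the nearest center) ** z.

    z = 1 gives the k-median cost and z = 2 gives the k-means cost.
    """
    return sum(min(dist(p, c) for c in centers) ** z for p in points)


def sliding_window_costs(stream, centers, z, window):
    """Yield the (k, z)-clustering cost of the last `window` points after each arrival.

    Naive baseline (an assumption for illustration): keeps every active point,
    so it uses space linear in the window size.
    """
    active = deque(maxlen=window)  # the oldest point expires automatically
    for p in stream:
        active.append(p)
        yield kz_cost(active, centers, z)


# Illustrative data only: four 2-D points, two fixed centers, window of size 2.
stream = [(0, 0), (0, 3), (4, 0), (10, 10)]
centers = [(0, 0), (10, 10)]
costs = list(sliding_window_costs(stream, centers, z=2, window=2))
```

With `z=2` and a window of two points, each yielded value is the k-means cost of the two most recent points with respect to the given centers; points that have slid out of the window no longer contribute.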

Taxonomy

Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition · Topological and Geometric Data Analysis