TL;DR
S-RASTER is a fast, memory-efficient streaming clustering algorithm that adapts the RASTER density-based method for evolving data streams, enabling real-time cluster detection with minimal precision loss.
Contribution
The paper introduces S-RASTER, an adaptation of RASTER to streaming data that keeps linear time and constant memory per time step while handling evolving clusters.
Findings
S-RASTER is at least 50% faster than competing algorithms.
Its clustering quality is good on standard metrics.
The algorithm is well-suited for real-world periodic clustering scenarios.
Abstract
Contraction Clustering (RASTER) is a single-pass algorithm for density-based clustering of 2D data. It can process arbitrary amounts of data in linear time and in constant memory, quickly identifying approximate clusters. It also exhibits good scalability in the presence of multiple CPU cores. RASTER exhibits very competitive performance compared to standard clustering algorithms, but at the cost of decreased precision. Yet, RASTER is limited to batch processing and unable to identify clusters that only exist temporarily. In contrast, S-RASTER is an adaptation of RASTER to the stream processing paradigm that is able to identify clusters in evolving data streams. This algorithm retains the main benefits of its parent algorithm, i.e. single-pass linear time cost and constant memory requirements for each discrete time step within a sliding window. The sliding window is efficiently pruned,…
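The sliding-window idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic grid-based counter written under several assumptions that the abstract does not state, namely that 2D points are projected onto square tiles, that a tile counts as dense once it holds at least `threshold` points, and that adjacent dense tiles are joined into clusters. The class name `SlidingTileCounter` and the parameters `window`, `threshold`, `precision`, and `min_size` are hypothetical names chosen for illustration.

```python
import math
from collections import defaultdict, deque


class SlidingTileCounter:
    """Illustrative sketch only: per-period tile counts in a sliding window."""

    def __init__(self, window, threshold, precision=1):
        self.window = window            # number of time periods retained (assumed)
        self.threshold = threshold      # min points for a dense tile (assumed)
        self.scale = 10 ** precision    # grid resolution (assumed)
        self.periods = deque()          # (period, {tile: count}), oldest first
        self.totals = defaultdict(int)  # tile -> count over the whole window

    def add(self, period, x, y):
        """Add one point; periods are assumed to arrive in non-decreasing order."""
        if not self.periods or self.periods[-1][0] != period:
            self.periods.append((period, defaultdict(int)))
            self._prune(period)
        tile = (math.floor(x * self.scale), math.floor(y * self.scale))
        self.periods[-1][1][tile] += 1
        self.totals[tile] += 1

    def _prune(self, current):
        """Drop periods that fell out of the window and subtract their counts."""
        while self.periods and self.periods[0][0] <= current - self.window:
            _, counts = self.periods.popleft()
            for tile, n in counts.items():
                self.totals[tile] -= n
                if self.totals[tile] == 0:
                    del self.totals[tile]

    def significant_tiles(self):
        """Tiles whose count within the window reaches the threshold."""
        return {t for t, c in self.totals.items() if c >= self.threshold}

    def clusters(self, min_size=1):
        """Group dense tiles that touch (8-neighbourhood) via flood fill."""
        remaining = self.significant_tiles()
        found = []
        while remaining:
            stack = [remaining.pop()]
            cluster = set(stack)
            while stack:
                cx, cy = stack.pop()
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        neighbour = (cx + dx, cy + dy)
                        if neighbour in remaining:
                            remaining.remove(neighbour)
                            cluster.add(neighbour)
                            stack.append(neighbour)
            if len(cluster) >= min_size:
                found.append(cluster)
        return found
```

In this sketch each point is handled once, and memory depends on the number of occupied tiles per retained period rather than on the number of points seen, which mirrors the single-pass, bounded-memory property the abstract claims for each time step. Once a period leaves the window its counts are subtracted, so a cluster that existed only temporarily stops being reported.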
