SECLEDS: Sequence Clustering in Evolving Data Streams via Multiple Medoids and Medoid Voting
Azqa Nadeem, Sicco Verwer

TL;DR
SECLEDS is a streaming clustering algorithm for evolving data streams. It handles concept drift by keeping multiple medoids per cluster and applying a medoid voting scheme, producing high-quality clusters with fewer distance computations.
Contribution
The paper introduces SECLEDS, a streaming k-medoids variant that uses multiple medoids per cluster and medoid voting to adapt to concept drift with a constant memory footprint.
Findings
SECLEDS achieves high-quality clustering comparable to offline methods.
It reduces distance computations by 83.7% compared to BanditPAM.
It outperforms the baseline algorithms by 138.7% on streams with drift.
Abstract
Sequence clustering in a streaming environment is challenging because it is computationally expensive, and the sequences may evolve over time. K-medoids or Partitioning Around Medoids (PAM) is commonly used to cluster sequences since it supports alignment-based distances, and the k-centers being actual data items helps with cluster interpretability. However, offline k-medoids has no support for concept drift, while also being prohibitively expensive for clustering data streams. We therefore propose SECLEDS, a streaming variant of the k-medoids algorithm with constant memory footprint. SECLEDS has two unique properties: i) it uses multiple medoids per cluster, producing stable high-quality clusters, and ii) it handles concept drift using an intuitive Medoid Voting scheme for approximating cluster distances. Unlike existing adaptive algorithms that create new clusters for new concepts,…
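The two properties named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the class name `MultiMedoidStream`, the mean-distance assignment rule, the vote decay factor, and the rule of replacing the least-voted medoid are all assumptions made for illustration, and the plain dynamic time warping function stands in for any alignment-based distance.

```python
from typing import Callable, List, Sequence

Seq = Sequence[float]


def dtw(a: Seq, b: Seq) -> float:
    """Plain dynamic time warping with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


class MultiMedoidStream:
    """Hypothetical streaming clusterer: k clusters, p medoids per cluster."""

    def __init__(
        self,
        initial_medoids: List[List[Seq]],
        distance: Callable[[Seq, Seq], float] = dtw,
        decay: float = 0.9,  # assumed forgetting factor for old votes
    ) -> None:
        self.medoids = [list(cluster) for cluster in initial_medoids]
        self.votes = [[0.0] * len(cluster) for cluster in initial_medoids]
        self.distance = distance
        self.decay = decay

    def update(self, item: Seq) -> int:
        """Assign one arriving item to a cluster, then adapt that cluster."""
        dists = [
            [self.distance(item, m) for m in cluster] for cluster in self.medoids
        ]
        # Assumption: a cluster's distance is the mean over its medoids.
        means = [sum(d) / len(d) for d in dists]
        c = min(range(len(means)), key=means.__getitem__)

        # Old votes fade, and the medoid nearest to the item earns a vote.
        self.votes[c] = [v * self.decay for v in self.votes[c]]
        nearest = min(range(len(dists[c])), key=dists[c].__getitem__)
        self.votes[c][nearest] += 1.0

        # Assumption: the least-voted medoid is swapped for the new item,
        # so the number of stored sequences never grows.
        weakest = min(range(len(self.votes[c])), key=self.votes[c].__getitem__)
        self.medoids[c][weakest] = item
        self.votes[c][weakest] = 0.0
        return c


# Toy stream: values in the first cluster drift upward from 2 to 6.
model = MultiMedoidStream(
    [[[0, 0, 0], [1, 1, 1]], [[10, 10, 10], [11, 11, 11]]]
)
labels = [model.update([v, v, v]) for v in (2, 3, 4, 5, 6)]
print(labels)
```

Because each arriving sequence replaces one stored medoid, the state stays at k × p sequences however long the stream runs. In this toy run the first cluster follows the rising values instead of handing them over to the second cluster.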
Taxonomy
Topics: Data Stream Mining Techniques · Time Series Analysis and Forecasting · Data Management and Algorithms
