Efficient Dynamic Clustering: Capturing Patterns from Historical Cluster Evolution
Binbin Gu, Saeed Kargar, Faisal Nawab

TL;DR
This paper introduces DynamicC, a machine-learning-based system that clusters efficiently and accurately in high-velocity dynamic environments by reusing previous clustering results and augmenting batch algorithms.
Contribution
It presents a novel dynamic clustering approach that combines machine learning with batch algorithms to improve speed while maintaining accuracy in evolving datasets.
Findings
Outperforms state-of-the-art methods in speed.
Maintains similar accuracy to batch algorithms.
Effective on real-world and synthetic datasets.
Abstract
Clustering aims to group unlabeled objects into clusters based on the similarity inherent among them. It is important for many tasks, such as anomaly detection, database sharding, and record linkage. Some clustering methods are batch algorithms that incur a high overhead because they cluster all the objects in the database from scratch, or they assume an incremental workload. In practice, database objects are continuously updated, added, and removed, which makes previous results stale. Running batch algorithms continuously in such scenarios is infeasible because of the significant overhead it would incur. This is particularly the case for high-velocity scenarios such as those in Internet of Things applications. In this paper, we tackle the problem of clustering in high-velocity dynamic scenarios, where the objects are continuously updated, inserted, and deleted.…
Taxonomy
Topics: Data Stream Mining Techniques · Caching and Content Delivery · Machine Learning and ELM
