Overview of streaming-data algorithms

T Soni Madhulatha

arXiv:1203.2000·cs.DB·March 12, 2012·2 cites

Open Access

TL;DR

This paper reviews algorithms for clustering data streams, emphasizing the importance of single-pass, memory-efficient methods suitable for real-time analysis of unbounded, high-velocity data from various sources.

Contribution

It provides an overview of recent developments in data stream clustering algorithms, highlighting their applications and the challenges of processing massive, unbounded data efficiently.

Findings

01

Single-pass algorithms are essential for real-time data stream clustering.

02

Clustering reveals data characteristics that can then be developed into classification or predictive models.

03

Applications include sensor data analysis and financial data processing.
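
To make the single-pass idea in the first finding concrete, here is a minimal sketch of sequential (online) k-means. It is an illustration only and is not taken from the paper: the function name `online_kmeans`, the choice to seed centroids with the first `k` points, and the use of squared Euclidean distance are all assumptions. Each point is read once and then discarded, so memory stays proportional to `k` times the dimensionality, regardless of stream length.

```python
from typing import Iterable, List, Sequence


def online_kmeans(stream: Iterable[Sequence[float]], k: int) -> List[List[float]]:
    """Sequential k-means sketch: one pass over the stream, O(k * d) memory."""
    centroids: List[List[float]] = []
    counts: List[int] = []  # number of points absorbed by each centroid

    for point in stream:
        if len(centroids) < k:
            # Assumption: the first k points seed the centroids.
            centroids.append([float(x) for x in point])
            counts.append(1)
            continue

        # Pick the closest centroid by squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda i: sum((c - x) ** 2 for c, x in zip(centroids[i], point)),
        )

        # Move that centroid toward the point by a running-mean step,
        # so it always equals the mean of the points assigned to it.
        counts[nearest] += 1
        rate = 1.0 / counts[nearest]
        centroids[nearest] = [
            c + rate * (x - c) for c, x in zip(centroids[nearest], point)
        ]

    return centroids


if __name__ == "__main__":
    # A generator stands in for an unbounded source; it can only be read once.
    readings = (p for p in [(0.0, 0.0), (10.0, 10.0), (0.0, 2.0), (10.0, 12.0)])
    print(online_kmeans(readings, k=2))
```

The running-mean update is what removes the need to store past points. A practical stream clusterer would also need some way to handle drift and outliers, which this sketch leaves out.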

Abstract

Due to recent advances in data collection techniques, massive amounts of data are being collected at an extremely fast pace. These data are also potentially unbounded. Boundless streams of data collected from sensors, equipment, and other data sources are referred to as data streams. Various data mining tasks can be performed on data streams in search of interesting patterns. This paper studies a particular data mining task, clustering, which can be used as the first step in many knowledge discovery processes. By grouping data streams into homogeneous clusters, data miners can learn about data characteristics, which can then be developed into classification models for new data or predictive models for unknown events. Recent research addresses the problem of data-stream mining to deal with applications that require processing huge amounts of data such as sensor data analysis and…

Taxonomy

Topics: Data Stream Mining Techniques · Time Series Analysis and Forecasting · Advanced Clustering Algorithms Research