Data Stream Classification using Random Feature Functions and Novel Method Combinations
Diego Marrón, Jesse Read, Albert Bifet, and Nacho Navarro

TL;DR
This paper explores combining Hoeffding trees, k-nearest neighbors, and gradient descent with random feature functions for improved data stream classification, demonstrating the benefits of GPU implementation on large real-world datasets.
Contribution
It introduces novel combinations of existing methods with random feature functions and evaluates GPU-based implementations for scalable data stream classification.
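As a rough illustration of what pairing random feature functions with a gradient descent learner can look like, the sketch below projects each instance through a fixed random layer and trains only a logistic output layer online, scoring it test-then-train. This is a minimal sketch, not the authors' implementation: the ReLU activation, the layer size, the learning rate, the synthetic stream, and the names `RandomFeatureSGD` and `prequential_accuracy` are all assumptions made for the example.

```python
import numpy as np


class RandomFeatureSGD:
    """Hypothetical example: fixed random ReLU features + online logistic regression."""

    def __init__(self, n_inputs, n_hidden=50, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Random projection: drawn once and never trained.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # Trainable linear output layer on top of the random features.
        self.v = np.zeros(n_hidden)
        self.c = 0.0
        self.lr = lr

    def features(self, x):
        # ReLU is an assumed choice of random feature function.
        return np.maximum(0.0, x @ self.W + self.b)

    def predict_proba(self, x):
        z = self.features(x) @ self.v + self.c
        z = np.clip(z, -30.0, 30.0)  # keep exp() numerically safe
        return 1.0 / (1.0 + np.exp(-z))

    def predict(self, x):
        return int(self.predict_proba(x) >= 0.5)

    def learn_one(self, x, y):
        # One stochastic gradient step on the log loss for a single instance.
        phi = self.features(x)
        err = self.predict_proba(x) - y
        self.v -= self.lr * err * phi
        self.c -= self.lr * err


def prequential_accuracy(model, stream):
    """Test-then-train: predict each instance first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)
        model.learn_one(x, y)
        total += 1
    return correct / total


# Synthetic stream (assumed for illustration): class depends on the sign of x1 * x2,
# which a linear model on the raw inputs cannot separate.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(2000, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

model = RandomFeatureSGD(n_inputs=2)
acc = prequential_accuracy(model, zip(X, y))
print(f"prequential accuracy: {acc:.3f}")
```

Because the random layer is never updated, each instance costs one matrix-vector product plus a linear update. That shape of computation is what makes this kind of method a natural candidate for GPU execution.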
Findings
Positive results for the proposed method combinations
GPU implementations enhance scalability and performance
Highlights promising future directions in data-stream classification
Abstract
Big Data streams are becoming faster, bigger, and more commonplace. In this scenario, Hoeffding Trees are an established method for classification. Several extensions exist, including high-performing ensemble setups such as online and leveraging bagging. Also, k-nearest neighbors is a popular choice, with most extensions dealing with the inherent performance limitations over a potentially infinite stream. At the same time, gradient descent methods are becoming increasingly popular, owing in part to the successes of deep learning. Although deep neural networks can learn incrementally, they have so far proved too sensitive to hyper-parameter options and initial conditions to be considered an effective 'off-the-shelf' data-streams solution. In this work, we look at combinations of Hoeffding trees, nearest neighbour, and gradient descent methods with a streaming…
Taxonomy
Topics: Data Stream Mining Techniques · Anomaly Detection Techniques and Applications · Water Systems and Optimization
