Ensemble of Distributed Learners for Online Classification of Dynamic Data Streams
Luca Canzian, Yu Zhang, and Mihaela van der Schaar

TL;DR
This paper introduces a distributed online ensemble learning scheme for classifying dynamic data streams. It derives theoretical bounds on the misclassification probability and reports performance gains of 34% to 71% over state-of-the-art methods.
Contribution
It proposes a novel online ensemble algorithm for distributed data streams that adapts to data dynamics and provides theoretical misclassification bounds, with extensive empirical validation.
Findings
Performance gains of 34% to 71% over state-of-the-art methods
Theoretical bounds on misclassification probability that tend to zero under certain conditions
Effective handling of distributed, heterogeneous, and dynamic data sources
Abstract
We present an efficient distributed online learning scheme to classify data captured from distributed, heterogeneous, and dynamic data sources. Our scheme consists of multiple distributed local learners that analyze different streams of data, each correlated to a common event that needs to be classified. Each learner uses a local classifier to make a local prediction. The local predictions are then collected by each learner and combined using a weighted majority rule to output the final prediction. We propose a novel online ensemble learning algorithm to update the aggregation rule in order to adapt to the underlying data dynamics. We rigorously determine a bound for the worst case misclassification probability of our algorithm, which depends on the misclassification probabilities of the best static aggregation rule and of the best local classifier. Importantly, the worst case…
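The abstract describes local predictions combined by a weighted majority rule whose weights are updated online. A minimal Python sketch of that idea is given here. The perceptron-style update, the -1/+1 label encoding, the learning rate `lr`, and all function names are illustrative assumptions, since the abstract does not state the exact update rule.

```python
from typing import List, Sequence, Tuple


def aggregate(weights: Sequence[float], local_preds: Sequence[int]) -> int:
    """Weighted majority vote over local predictions in {-1, +1}."""
    score = sum(w * p for w, p in zip(weights, local_preds))
    # Ties (score == 0) are broken toward +1; this is an arbitrary choice.
    return 1 if score >= 0 else -1


def update(weights: Sequence[float], local_preds: Sequence[int],
           label: int, lr: float = 1.0) -> List[float]:
    """Assumed perceptron-style update: change weights only on a mistake."""
    if aggregate(weights, local_preds) == label:
        return list(weights)
    # Raise the weight of learners that agreed with the true label,
    # lower the weight of learners that disagreed.
    return [w + lr * label * p for w, p in zip(weights, local_preds)]


def run_stream(pred_stream: Sequence[Sequence[int]], labels: Sequence[int],
               n_learners: int) -> Tuple[List[float], int]:
    """Process a stream of local predictions; return final weights and mistakes."""
    weights = [0.0] * n_learners
    mistakes = 0
    for preds, y in zip(pred_stream, labels):
        if aggregate(weights, preds) != y:
            mistakes += 1
        weights = update(weights, preds, y)
    return weights, mistakes
```

Calling `run_stream(pred_stream, labels, n_learners)` with one row of local predictions per time step returns the learned weights and the number of ensemble mistakes. Because the weights change only when the ensemble errs, learners that are consistently wrong end up with negative weight, so their votes are effectively inverted.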
Taxonomy
Topics: Data Stream Mining Techniques · Advanced Bandit Algorithms Research · Machine Learning and Data Classification
