Mining Drifting Data Streams on a Budget: Combining Active Learning with Self-Labeling
Łukasz Korycki, Bartosz Krawczyk

TL;DR
This paper introduces a hybrid framework combining active learning and self-labeling to efficiently mine drifting data streams under limited labeling budgets, addressing challenges of non-stationary data and resource constraints.
Contribution
It proposes a novel, adaptable hybrid approach that reduces label requirements in streaming data mining by integrating intelligent instance selection and semi-supervised learning.
Findings
Effective in handling concept drift in real-world data streams
Reduces labeling costs while maintaining high classification accuracy
Applicable with various learning algorithms
Abstract
Mining data streams poses a number of challenges, including the continuous and non-stationary nature of data, the massive volume of information to be processed, and constraints on computational resources. While a number of supervised solutions have been proposed for this problem in the literature, most of them assume that access to the ground truth (in the form of class labels) is unlimited and that such information can be used instantly when updating the learning system. This is far from realistic, as one must consider the underlying cost of acquiring labels. Therefore, solutions that can reduce the requirements for ground truth in streaming scenarios are required. In this paper, we propose a novel framework for mining drifting data streams on a budget, by combining information coming from active learning and self-labeling. We introduce several strategies that can take advantage…
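As a rough illustration of the idea in the abstract, the sketch below processes a stream one instance at a time: it queries an oracle for the true label when the model is uncertain and the labeling budget allows it, and otherwise self-labels the instance when the model is highly confident. This is not the authors' strategy. The names OnlineCentroidClassifier and process_stream, the thresholds query_below and self_label_above, and the nearest-centroid learner are all assumptions chosen to keep the example self-contained.

```python
import math


class OnlineCentroidClassifier:
    """Tiny incremental nearest-centroid model; a stand-in for any online learner."""

    def __init__(self):
        self.sums = {}    # label -> per-feature running sums
        self.counts = {}  # label -> number of training updates

    def update(self, x, y):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
            self.counts[y] = 0
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict(self, x):
        """Return (label, confidence); confidence is 0.0 until two classes are known."""
        if len(self.counts) < 2:
            return None, 0.0
        # Softmax over negative Euclidean distances to the class centroids.
        scores = {
            y: -math.dist(x, [v / self.counts[y] for v in s])
            for y, s in self.sums.items()
        }
        top = max(scores.values())
        exps = {y: math.exp(sc - top) for y, sc in scores.items()}
        label = max(exps, key=exps.get)
        return label, exps[label] / sum(exps.values())


def process_stream(stream, oracle, budget=0.1, query_below=0.7, self_label_above=0.95):
    """Run one pass over the stream and return simple counters.

    budget is the maximum fraction of seen instances whose labels may be bought.
    The two confidence thresholds are hypothetical values, not from the paper.
    """
    model = OnlineCentroidClassifier()
    stats = {"seen": 0, "queried": 0, "self_labeled": 0}
    for x in stream:
        stats["seen"] += 1
        label, conf = model.predict(x)
        within_budget = stats["queried"] / stats["seen"] < budget
        if conf < query_below and within_budget:
            # Active learning: pay for a true label on an uncertain instance.
            model.update(x, oracle(x))
            stats["queried"] += 1
        elif conf >= self_label_above:
            # Self-labeling: trust a confident prediction at no labeling cost.
            model.update(x, label)
            stats["self_labeled"] += 1
        # Otherwise the instance is discarded without updating the model.
    return stats
```

The sketch deliberately omits any drift handling: the centroids average over the whole stream, so a real drifting stream would also need some form of forgetting or drift detection, and self-labeled mistakes could otherwise reinforce an outdated concept.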
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Advanced Bandit Algorithms Research
