STREAMLINE: Streaming Active Learning for Realistic Multi-Distributional Settings
Nathan Beck, Suraj Kothawade, Pradeep Shenoy, Rishabh Iyer

TL;DR
STREAMLINE is a streaming active learning framework for multi-distributional data streams. It improves model performance on rare but important data slices in real-world tasks such as image classification and object detection.
Contribution
STREAMLINE introduces a three-step procedure that uses submodular information measures to identify data slices, budget for them, and select data to label, mitigating slice imbalance in the working labeled data drawn from the stream.
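The three steps named above (identify, budget, select) can be illustrated with a small sketch. This is not the authors' implementation: the similarity-based slice assignment, the inverse-frequency budget rule, and the facility-location objective are all assumptions chosen for illustration, and every function name here (`identify_slice`, `slice_budget`, `greedy_facility_location`) is hypothetical. Facility location stands in as one familiar submodular function; the paper's own submodular information measures may differ.

```python
import numpy as np

def unit(x):
    """Row-normalize feature vectors so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def identify_slice(episode, exemplars):
    """Step 1 (assumed rule): pick the slice whose labeled exemplars are,
    on average, most similar to the incoming episode."""
    scores = {s: float(np.mean(episode @ ex.T)) for s, ex in exemplars.items()}
    return max(scores, key=scores.get)

def slice_budget(slice_id, counts, max_budget):
    """Step 2 (assumed rule): rarer slices get a larger share of the
    per-episode labeling budget, with a floor of one label."""
    inv = {s: 1.0 / (c + 1) for s, c in counts.items()}
    share = inv[slice_id] / max(inv.values())
    return max(1, int(round(max_budget * share)))

def greedy_facility_location(sim, k):
    """Step 3 (assumed objective): greedily maximize the submodular function
    f(A) = sum_i max_{j in A} sim[i, j], with sim non-negative."""
    n = sim.shape[0]
    selected = []
    best = np.zeros(n)  # best[i] = similarity of point i to its closest pick
    for _ in range(min(k, n)):
        # Marginal gain of adding each candidate column j.
        gains = np.maximum(sim, best[:, None]).sum(axis=0) - best.sum()
        if selected:
            gains[selected] = -np.inf  # never pick the same point twice
        j = int(np.argmax(gains))
        selected.append(j)
        best = np.maximum(best, sim[:, j])
    return selected

# Toy stream: one frequent slice, one rare slice, and an episode from the rare one.
rng = np.random.default_rng(0)
common_center, rare_center = 5 * np.eye(8)[0], 5 * np.eye(8)[1]
exemplars = {
    "common": unit(rng.normal(size=(50, 8)) + common_center),
    "rare": unit(rng.normal(size=(5, 8)) + rare_center),
}
counts = {"common": 500, "rare": 20}  # labeled instances held per slice
episode = unit(rng.normal(size=(30, 8)) + rare_center)

slice_id = identify_slice(episode, exemplars)
budget = slice_budget(slice_id, counts, max_budget=10)
sim = (1.0 + episode @ episode.T) / 2.0  # shift cosine similarity into [0, 1]
chosen = greedy_facility_location(sim, budget)
print(slice_id, budget, chosen)
```

In this toy setup the rare slice receives the full per-episode budget while an episode from the common slice would receive only the floor of one label, which is the kind of rebalancing the contribution describes.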
Findings
Improves accuracy on infrequent data slices by up to 5%.
Enhances mAP on object detection tasks by up to 8%.
Effectively handles multi-distributional streaming data.
Abstract
Deep neural networks have consistently shown great performance in several real-world use cases like autonomous vehicles, satellite imaging, etc., effectively leveraging large corpora of labeled training data. However, learning unbiased models depends on building a dataset that is representative of a diverse range of realistic scenarios for a given task. This is challenging in many settings where data comes from high-volume streams, with each scenario occurring in random interleaved episodes at varying frequencies. We study realistic streaming settings where data instances arrive in and are sampled from an episodic multi-distributional data stream. Using submodular information measures, we propose STREAMLINE, a novel streaming active learning framework that mitigates scenario-driven slice imbalance in the working labeled data via a three-step procedure of slice identification,…
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Advanced Bandit Algorithms Research
