On weighted uncertainty sampling in active learning
Vinay Jethava

TL;DR
This paper investigates probabilistic uncertainty sampling in active learning, highlighting its computational efficiency, ease of implementation, and effectiveness in balancing exploration and representation, especially with biased initial labeled data.
Contribution
It shows on publicly available datasets that weighted uncertainty sampling is often beneficial in active learning, offering a practical, efficient approach that helps most when the starting labeled set is biased.
Findings
Weighted sampling is computationally cheap.
It can be implemented in a streaming, single-pass manner.
It often improves active learning performance on publicly available datasets, especially when the initial labeled set is biased.
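The single-pass claim can be illustrated with a short sketch. It assumes the scoring subsystem emits (item, weight) pairs, where the weight is an uncertainty score, and it uses weighted reservoir sampling (the Efraimidis-Spirakis key trick) as one standard way to draw a weighted sample in a single pass. The function name stream_weighted_sample and the choice of this particular algorithm are assumptions for illustration, not details taken from the paper.

```python
import heapq
import random


def stream_weighted_sample(stream, k, rng=None):
    """Single-pass weighted sample of up to k items without replacement.

    `stream` yields (item, weight) pairs. Each item with positive weight gets
    the key u ** (1 / weight), u ~ Uniform(0, 1), and the k largest keys are
    kept in a min-heap. Memory use is O(k) regardless of stream length.
    """
    if k <= 0:
        return []
    rng = rng or random.Random()
    heap = []  # entries are (key, arrival_index, item); the index breaks ties
    for idx, (item, weight) in enumerate(stream):
        if weight <= 0.0:
            continue  # zero-uncertainty items are never selected
        key = rng.random() ** (1.0 / weight)
        entry = (key, idx, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # The new key beats the smallest kept key, so swap it in.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


# Hypothetical stream of (example_id, uncertainty) pairs.
scored = [("a", 0.7), ("b", 0.1), ("c", 0.0), ("d", 0.4), ("e", 0.9)]
to_label = stream_weighted_sample(iter(scored), k=3, rng=random.Random(0))
```

Because each item is inspected once and only k entries are held at a time, scoring and feedback collection can live in separate subsystems, which is the deployment setting the note describes.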
Abstract
This note explores probabilistic sampling weighted by uncertainty in active learning. The method has been used previously, and authors have tangentially remarked on its efficacy. The scheme has several benefits: (1) it is computationally cheap; (2) it can be implemented in a single-pass streaming fashion, which helps in real-world systems where different subsystems perform the suggestion scoring and the extraction of user feedback; and (3) it is easily parameterizable. In this paper, we show on publicly available datasets that probabilistic weighting is often beneficial and strikes a good compromise between exploration and representation, especially when the starting set of labeled points is biased.
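A minimal sketch of what sampling weighted by uncertainty can look like follows. The abstract does not specify the uncertainty measure or the parameterization, so everything here is an assumption for illustration: predictive entropy as the uncertainty score, a hypothetical exponent beta as the tunable parameter, and the helper names entropy, sampling_weights, and draw_batch.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sampling_weights(pool_probs, beta=1.0, eps=1e-12):
    """Normalized weights proportional to (entropy + eps) ** beta.

    beta is a hypothetical knob: beta = 0 gives uniform random sampling,
    and larger values put more mass on the most uncertain points.
    """
    raw = [(entropy(p) + eps) ** beta for p in pool_probs]
    total = sum(raw)
    return [r / total for r in raw]


def draw_batch(pool_probs, k, beta=1.0, rng=None):
    """Draw up to k distinct pool indices, one weighted draw at a time."""
    rng = rng or random.Random()
    weights = sampling_weights(pool_probs, beta)
    remaining = list(range(len(pool_probs)))
    batch = []
    for _ in range(min(k, len(remaining))):
        w = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        batch.append(pick)
        remaining.remove(pick)
    return batch


# Hypothetical model outputs for a three-point unlabeled pool.
pool = [[0.5, 0.5], [0.9, 0.1], [1.0, 0.0]]
batch = draw_batch(pool, k=2, beta=1.0, rng=random.Random(0))
```

Under these assumptions, beta interpolates between uniform sampling, which favors representation of the pool, and near-greedy selection of the most uncertain points, which is one way to read the trade-off the abstract describes.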
Taxonomy
Topics: Machine Learning and Algorithms · Data Stream Mining Techniques · Advanced Bandit Algorithms Research
