On weighted uncertainty sampling in active learning

Vinay Jethava

arXiv:1909.04928·cs.LG·September 12, 2019

TL;DR

This paper investigates probabilistic uncertainty sampling in active learning, highlighting its computational efficiency, ease of implementation, and effectiveness in balancing exploration and representation, especially with biased initial labeled data.

Contribution

The paper shows on publicly available datasets that weighted uncertainty sampling is often beneficial in active learning. It offers a cheap, easily implemented approach that helps most when the starting set of labeled points is biased.

Findings

01 Weighted sampling is computationally cheap.

02 It can be implemented in a streaming, single-pass manner.

03 It often improves active learning performance on publicly available datasets.
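
As a rough illustration of why the scheme is cheap and easy to parameterize, the sketch below draws a batch of pool points with probability proportional to a power of their predictive entropy. The exact weighting used in the paper is not given on this page, so the entropy score, the exponent `beta`, and the function names are all assumptions made for illustration only.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return max(0.0, -sum(p * math.log(p) for p in probs if p > 0.0))


def weighted_uncertainty_sample(pred_probs, k, beta=1.0, seed=None):
    """Draw up to k distinct pool indices with probability proportional to entropy**beta.

    beta is a hypothetical sharpening knob (not taken from the paper):
    beta = 0 gives uniform sampling, and a large beta approaches greedy
    top-k uncertainty sampling.
    """
    rng = random.Random(seed)
    weights = [entropy(p) ** beta for p in pred_probs]
    candidates = list(range(len(pred_probs)))
    chosen = []
    for _ in range(min(k, len(candidates))):
        current = [weights[i] for i in candidates]
        if sum(current) <= 0.0:
            # Every remaining point has zero weight: fall back to uniform.
            pick = rng.choice(candidates)
        else:
            pick = rng.choices(candidates, weights=current)[0]
        candidates.remove(pick)
        chosen.append(pick)
    return chosen


# Toy pool: only the middle point is uncertain, so it is always drawn first.
pool = [[1.0, 0.0], [0.5, 0.5], [1.0, 0.0]]
first = weighted_uncertainty_sample(pool, k=1, seed=0)
```

Setting `beta` between the two extremes trades off exploring the pool against concentrating on the most uncertain points, which is one plausible reading of the "easily parameterizable" claim.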

Abstract

This note explores probabilistic sampling weighted by uncertainty in active learning. The method has been used before, and earlier authors have remarked on its efficacy only in passing. The scheme has several benefits: (1) it is computationally cheap; (2) it can be implemented in a single-pass, streaming fashion, which helps in real-world deployments where separate subsystems score the suggestions and collect user feedback; and (3) it is easily parameterizable. We show on publicly available datasets that probabilistic weighting is often beneficial and strikes a good compromise between exploration and representation, especially when the starting set of labelled points is biased.
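
One way to realize the single-pass, streaming property is weighted reservoir sampling. The sketch below uses the Efraimidis-Spirakis key scheme, which is a standard technique swapped in here for illustration; this page does not say which streaming procedure the paper uses. The function name and the `(item, weight)` stream format are assumptions.

```python
import heapq
import random


def stream_weighted_sample(scored_stream, k, seed=None):
    """Single-pass weighted sampling without replacement (Efraimidis-Spirakis keys).

    scored_stream yields (item, weight) pairs, where weight is a non-negative
    uncertainty score. Each item receives the key u ** (1 / weight) with
    u drawn uniformly from [0, 1); the k items with the largest keys are kept.
    """
    if k <= 0:
        return []
    rng = random.Random(seed)
    heap = []  # min-heap of (key, arrival_index, item); smallest key on top
    for idx, (item, weight) in enumerate(scored_stream):
        if weight <= 0.0:
            continue  # zero-uncertainty items are never selected
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, idx, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, idx, item))
    return [item for _, _, item in heap]


# The scorer and the feedback collector only need to share this stream;
# the sampler never looks at an item twice.
stream = iter([("a", 0.0), ("b", 2.0), ("c", 0.0), ("d", 1.0)])
picked = stream_weighted_sample(stream, k=2, seed=0)
```

Because the sampler holds only `k` items at a time, memory stays constant in the stream length, and the scoring subsystem can emit weights without knowing the pool size in advance.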

Taxonomy

Topics: Machine Learning and Algorithms · Data Stream Mining Techniques · Advanced Bandit Algorithms Research