Uncertainty Herding: One Active Learning Method for All Label Budgets

Wonho Bae; Gabriel L. Oliveira; Danica J. Sutherland

arXiv:2412.20644·cs.LG·February 28, 2025

TL;DR

This paper introduces Uncertainty Herding, a versatile active learning method that adapts seamlessly across different label budget regimes, outperforming existing methods in both low- and high-budget scenarios.

Contribution

It proposes uncertainty coverage as a unified objective and develops Uncertainty Herding, a simple, fast, and nearly optimal method that works well across all label budgets.

Findings

01. Outperforms state-of-the-art in various tasks

02. Works effectively in both low- and high-budget regimes

03. Nearly optimizes distribution-level coverage

Abstract

Most active learning research has focused on methods which perform well when many labels are available, but can be dramatically worse than random selection when label budgets are small. Other methods have focused on the low-budget regime, but do poorly as label budgets increase. As the line between "low" and "high" budgets varies by problem, this is a serious issue in practice. We propose uncertainty coverage, an objective which generalizes a variety of low- and high-budget objectives, as well as natural, hyperparameter-light methods to smoothly interpolate between low- and high-budget regimes. We call greedy optimization of the estimate Uncertainty Herding; this simple method is computationally fast, and we prove that it nearly optimizes the distribution-level coverage. In experimental validation across a variety of active learning tasks, our proposal matches or beats state-of-the-art…
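The greedy selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, for illustration only, that the coverage estimate is the mean over an unlabeled pool of an uncertainty weight times the largest kernel similarity to any selected point, with a fixed-bandwidth RBF kernel. The names `rbf_kernel` and `greedy_uncertainty_coverage`, the `sigma` parameter, and the toy data are all hypothetical; how the paper computes uncertainty and sets the kernel bandwidth is not shown on this page.

```python
import numpy as np


def rbf_kernel(X, Y, sigma):
    """Pairwise RBF similarities between rows of X and rows of Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def greedy_uncertainty_coverage(X, uncertainty, budget, sigma=1.0):
    """Greedily pick `budget` pool indices to maximize an ASSUMED objective:

        mean_i  uncertainty[i] * max_{s in selected} k(x_i, x_s)

    This objective is an illustrative stand-in, not the paper's exact estimator.
    """
    K = rbf_kernel(X, X, sigma)        # K[i, j] = k(x_i, x_j)
    covered = np.zeros(X.shape[0])     # current max similarity to the selected set
    selected = []
    for _ in range(budget):
        # Coverage of every pool point i if candidate j were added.
        new_cov = np.maximum(covered[:, None], K)
        # Uncertainty-weighted marginal gain of each candidate j.
        gains = (uncertainty[:, None] * (new_cov - covered[:, None])).mean(axis=0)
        if selected:
            gains[selected] = -np.inf  # never pick the same point twice
        j = int(np.argmax(gains))
        selected.append(j)
        covered = np.maximum(covered, K[:, j])
    return selected, float((uncertainty * covered).mean())


# Toy pool: cluster A = points 0-1, cluster B = points 2-4 (hypothetical data).
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Uniform uncertainty: selection behaves like pure coverage.
uniform_pick, uniform_cov = greedy_uncertainty_coverage(X, np.ones(5), budget=2)

# High uncertainty on cluster A: the first pick moves to that cluster.
weighted = np.array([10.0, 10.0, 0.1, 0.1, 0.1])
weighted_pick, _ = greedy_uncertainty_coverage(X, weighted, budget=1)
```

With uniform weights the sketch reduces to a representation-style selection (it covers the larger cluster first, then the other), while concentrating the weights on one region pulls the selection toward uncertain points. That is the intuition behind interpolating between low- and high-budget behavior, under the assumptions stated above.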

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

- The problem of navigating active learning strategies across budget regimes seems interesting and relevant to the field, as highlighted by recent work [Hacohen and Weinshall, 2023].
- Balancing uncertainty and representativeness has been explored before in active learning, as discussed by the authors. The notion of uncertainty coverage introduced in this paper is built upon that of generalized coverage studied in [Bae et al., 2024] -- it is defined by weighting generalized coverage with the mo

Weaknesses

- The theoretical results provide some insights but are primarily limited to a generalization guarantee for the empirical estimator of UCoverage and an approximation guarantee of the greedy algorithm. It would be interesting to see a deeper analysis that directly addresses the core goals in active learning, i.e., reducing error rates with as few labels as possible. For example, how does UCoverage perform as an objective in active learning in terms of *excess risk* and *label complexity* (potenti

Reviewer 02 · Rating 8 · Confidence 3

Strengths

UHerding successfully addresses both low- and high-budget scenarios in a unified approach, eliminating the need to switch frameworks and tackling the practical challenge of unclear budget boundaries. This method is also practical, as it doesn’t demand high training costs. This innovative approach is a strong contribution. The paper provides rigorous theoretical analysis into the estimation quality and parameter adaptation of UHerding. Experiments span multiple datasets and scenarios, showing UH

Weaknesses

The method relies on having suitable pre-trained feature extractors and an accurate approximation of the data distribution, which might not always be feasible in real-world situations. While it shows strong performance in supervised tasks, its effectiveness in transfer learning could benefit from additional validation against approaches tailored to specific domains.

Reviewer 03 · Rating 6 · Confidence 2

Strengths

The theoretical results provide insights into algorithm design, especially how to make the proposed method robust across different budget levels.

Weaknesses

In Figure 5, there is a sudden increase in the difference in accuracy from the 7.2k to the 8k budget. This increase applies to all methods and goes against the general trend observed in Figures 5, 6, and 7. Could the authors provide some insight into why that may happen? A similar hike happens in Figure 6(a). I suspect this is due to the small number of random seeds used. The error bars in Figure 6(a) are quite high, and these hikes might be caused by an outlier point. Could the authors provide results for

Taxonomy

Topics: Advanced Text Analysis Techniques · Mathematics, Computing, and Information Processing · Semantic Web and Ontologies

Methods: Attentive Walk-Aggregating Graph Neural Network