Active Learning on a Budget: Opposite Strategies Suit High and Low Budgets
Guy Hacohen, Avihu Dekel, Daphna Weinshall

TL;DR
This paper explores how active learning strategies should adapt to different budget sizes, introducing TypiClust, a new method that excels in low-budget scenarios and significantly improves semi-supervised learning performance.
Contribution
The paper presents TypiClust, a novel deep active learning strategy optimized for low budgets, supported by theoretical analysis and empirical validation across multiple datasets.
Findings
TypiClust outperforms existing strategies in low-budget regimes.
Using TypiClust in semi-supervised learning greatly enhances accuracy.
State-of-the-art results are achieved on CIFAR-10 with very little labeled data.
Abstract
Investigating active learning, we focus on the relation between the number of labeled examples (budget size) and suitable querying strategies. Our theoretical analysis shows a behavior reminiscent of a phase transition: typical examples are best queried when the budget is low, while unrepresentative examples are best queried when the budget is large. Combined evidence shows that a similar phenomenon occurs in common classification models. Accordingly, we propose TypiClust -- a deep active learning strategy suited for low budgets. In a comparative empirical investigation of supervised learning, using a variety of architectures and image datasets, TypiClust outperforms all other active learning strategies in the low-budget regime. Using TypiClust in the semi-supervised framework, performance gets an even more significant boost. In particular, state-of-the-art semi-supervised methods…
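The abstract does not spell out the selection procedure, so the following is only a sketch of the general idea of querying typical examples under a low budget: split the unlabeled pool into as many clusters as the budget allows, then pick from each cluster the point that sits in the densest region. The typicality measure used here (inverse mean distance to the k nearest neighbours), the plain k-means routine, and the names typicality, kmeans, and select_typical are assumptions made for illustration, not the authors' implementation. In practice the points would be learned feature embeddings rather than raw coordinates.

```python
import math
import random


def typicality(points, idx, k):
    """Assumed density proxy: inverse of the mean distance from
    points[idx] to its k nearest neighbours in the whole pool."""
    dists = sorted(math.dist(points[idx], p)
                   for j, p in enumerate(points) if j != idx)
    nearest = dists[:k]
    return 1.0 / (sum(nearest) / len(nearest) + 1e-12)


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for every point."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(n_clusters),
                      key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties out
                centers[c] = tuple(sum(x) / len(members)
                                   for x in zip(*members))
    return labels


def select_typical(points, budget, k=3):
    """Return one index per cluster: the member with the highest typicality."""
    labels = kmeans(points, budget)
    chosen = []
    for c in range(budget):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            chosen.append(max(members,
                              key=lambda i: typicality(points, i, k)))
    return chosen
```

Calling select_typical(points, budget) on a list of coordinate tuples returns the indices to send for labeling. One query per cluster keeps the selection spread across the pool, while the typicality score steers each query away from outliers, which is the low-budget behavior the abstract describes.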
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · COVID-19 diagnosis using AI
