How To Overcome Confirmation Bias in Semi-Supervised Image Classification By Active Learning
Sandra Gilhuber, Rasmus Hvingelby, Mang Ling Ada Fok, Thomas Seidl

TL;DR
This paper investigates how active learning can mitigate confirmation bias in semi-supervised image classification under realistic data challenges, demonstrating that active learning can outperform random sampling in such scenarios.
Contribution
It introduces three real-world data challenges and shows that active learning can effectively overcome confirmation bias in semi-supervised learning where random sampling fails.
Findings
Active learning mitigates confirmation bias in realistic data scenarios.
Semi-supervised learning with random sampling can perform worse than plain supervised learning.
Active semi-supervised methods outperform random sampling in challenging data conditions.
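
To make the contrast between random sampling and active learning concrete, here is a minimal sketch of the two selection strategies. It assumes entropy-based uncertainty sampling as the active learning criterion; this summary does not name the acquisition functions the paper evaluates, so the criterion, the function names (`select_random`, `select_by_entropy`), and the toy probabilities are all illustrative assumptions rather than the authors' method.

```python
import math
import random


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_random(pool_size, budget, seed=0):
    """Baseline: pick `budget` unlabeled indices uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(pool_size), budget)


def select_by_entropy(pool_probs, budget):
    """Assumed AL criterion: pick the samples the model is least sure about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model predictions for four unlabeled images, three classes.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],  # torn between two classes
]

active_picks = select_by_entropy(pool_probs, budget=2)
random_picks = select_random(len(pool_probs), budget=2)
```

Random selection ignores the model's state, whereas the entropy criterion directs the labeling budget toward samples where the model's predictions are least reliable.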
Abstract
Do we need active learning? The rise of strong deep semi-supervised methods raises doubt about the usability of active learning in limited labeled data settings. This doubt stems from results showing that combining semi-supervised learning (SSL) methods with random selection for labeling can outperform existing active learning (AL) techniques. However, these results come from experiments on well-established benchmark datasets, which can overestimate external validity. Moreover, the literature lacks sufficient research on the performance of active semi-supervised learning methods in realistic data scenarios, leaving a notable gap in our understanding. Therefore, we present three data challenges common in real-world applications: between-class imbalance, within-class imbalance, and between-class similarity. These challenges can hurt SSL performance due to confirmation bias. We…
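
The confirmation bias mentioned in the abstract can be illustrated with a small sketch. It assumes a confidence-threshold pseudo-labeling scheme, a common SSL technique that this summary does not confirm the paper uses; the function names, the threshold value, and the toy data are hypothetical. The point is that a model can be confidently wrong on similar classes, and those wrong pseudo-labels are then fed back into training.

```python
def pseudo_label(pool_probs, threshold=0.95):
    """Return (index, predicted_class) for predictions at or above the threshold."""
    selected = []
    for i, probs in enumerate(pool_probs):
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((i, probs.index(confidence)))
    return selected


def pseudo_label_error_rate(selected, true_labels):
    """Fraction of accepted pseudo-labels that disagree with the true label."""
    if not selected:
        return 0.0
    wrong = sum(1 for i, predicted in selected if true_labels[i] != predicted)
    return wrong / len(selected)


# Hypothetical two-class pool with similar classes: sample 1 truly belongs
# to class 1, but the model confidently assigns it to class 0.
pool_probs = [
    [0.97, 0.03],
    [0.96, 0.04],
    [0.60, 0.40],
    [0.02, 0.98],
]
true_labels = [0, 1, 1, 1]

accepted = pseudo_label(pool_probs, threshold=0.95)
error_rate = pseudo_label_error_rate(accepted, true_labels)
```

In this toy pool, one of the three accepted pseudo-labels is wrong. Training on it would reinforce the original mistake, which is the feedback loop that labeling informative samples is meant to break.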
Taxonomy
Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
