Balancing Uncertainty and Diversity of Samples: Leveraging Diversity of Least, High Confidence Samples for Effective Active Learning

Vipul Arya; S.H. Shabbeer Basha; Srikrishna U N; Sunainha Vijay; Snehasis Mukherjee

arXiv:2605.22169·cs.CV·May 22, 2026

TL;DR

This paper introduces four hybrid active learning sampling methods that select both easy and hard, yet diverse, samples to improve model training efficiency, with LCD outperforming existing methods.

Contribution

The paper proposes novel hybrid sampling strategies for active learning that combine uncertainty and diversity, notably the LCD method, with extensive experimental validation.

Findings

01. LCD consistently outperforms state-of-the-art methods.

02. Selecting uncertain and diverse samples enhances feature learning.

03. Hybrid sampling improves data efficiency in active learning.

Abstract

Deep learning models, including Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have achieved state-of-the-art performance on computer vision tasks such as object classification, detection, segmentation, and generation. However, these models are data-hungry: they require large amounts of training data to learn millions or billions of parameters. For supervised learning in particular, curating a large number of labeled samples is expensive and time-consuming. Active Learning (AL) has long been used to address this problem. Existing active learning methods choose samples for annotation from a pool of unlabeled samples based on either diversity or uncertainty. Selecting along only one of these dimensions may hinder the model's performance. In this paper, we propose…
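To make the idea of selecting on uncertainty and diversity together more concrete, here is a minimal Python sketch of one generic hybrid scheme. It is not the paper's LCD method, whose details are not given in the abstract above. The sketch assumes least-confidence scoring for uncertainty and greedy farthest-point selection in feature space for diversity; the function names and the `pool_factor` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def least_confidence(probs):
    """Uncertainty score per sample: 1 minus the top predicted class probability."""
    return 1.0 - probs.max(axis=1)


def greedy_diverse_subset(features, k):
    """Greedy farthest-point selection of k row indices (assumes k <= len(features))."""
    # Seed with the point farthest from the feature mean (an arbitrary choice).
    first = int(np.argmax(np.linalg.norm(features - features.mean(axis=0), axis=1)))
    chosen = [first]
    # min_dist[i] = distance from point i to its nearest already-chosen point.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    min_dist[first] = -np.inf  # never re-select a chosen point
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(features - features[nxt], axis=1))
        min_dist[nxt] = -np.inf
    return chosen


def hybrid_select(probs, features, budget, pool_factor=3):
    """Hypothetical hybrid query: keep the most uncertain candidates, then diversify.

    probs:       (n, num_classes) softmax outputs on the unlabeled pool
    features:    (n, d) embeddings of the same samples
    budget:      number of samples to send for annotation
    pool_factor: assumed multiplier for the size of the uncertain candidate set
    """
    scores = least_confidence(probs)
    n_candidates = min(len(scores), budget * pool_factor)
    candidates = np.argsort(-scores)[:n_candidates]  # most uncertain first
    local = greedy_diverse_subset(features[candidates], min(budget, n_candidates))
    return candidates[local]
```

Calling `hybrid_select(probs, features, budget=20)` returns 20 pool indices drawn from the 60 least-confident samples and spread out in feature space. The two-stage design is one simple way to avoid selecting on a single dimension; other combinations, such as weighting the two criteria, are equally plausible.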

Code & Models

Repositories

XXX/LCD
github
