# Enhanced uncertainty sampling with category information for improved active learning

**Authors:** Xiaochuan Wang, Bo Zhang, Fei Wang, Tao Bao, Zhiqing Lu, Jiawei Bao

PMC · DOI: 10.1371/journal.pone.0327694 · PLOS One · 2025-07-07

## TL;DR

This paper introduces an improved active learning method that balances sample selection across classes by incorporating category information into uncertainty sampling.

## Contribution

The framework derives category features from a pre-trained VGG16 network using cosine similarity, with no additional model training, and combines them with uncertainty sampling to achieve balanced, efficient data annotation in multi-class computer vision tasks.

## Key findings

- The method achieves competitive mAP scores in object detection with balanced category representation.
- For image classification, it achieves accuracy comparable to state-of-the-art approaches while reducing computational overhead by up to 80%.
- Experiments show the approach effectively balances sampling efficiency and dataset representativeness.

## Abstract

Traditional uncertainty sampling methods in active learning often neglect category information, leading to imbalanced sample selection in multi-class computer vision tasks. Our approach integrates category information with uncertainty sampling through a novel active learning framework to address this limitation. Our method employs a pre-trained VGG16 architecture and cosine similarity metrics to efficiently extract category features without requiring additional model training. The framework combines these features with traditional uncertainty measures to ensure balanced sampling across classes while maintaining computational efficiency. Extensive experiments across both object detection and image classification tasks validate our method’s effectiveness. For object detection, our approach achieves competitive mAP scores while ensuring balanced category representation. For image classification, our method achieves accuracy comparable to state-of-the-art approaches while reducing computational overhead by up to 80%. The results validate our approach’s ability to balance sampling efficiency with dataset representativeness across different computer vision tasks. This work offers a practical, efficient solution for large-scale data annotation in domains with limited labeled data and diverse class distributions.
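The idea in the abstract, combining a category signal with an uncertainty score so that selection does not collapse onto one class, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors are assumed to come from a pre-trained network such as VGG16 (plain lists stand in for them here), the per-class `prototypes`, the use of entropy as the uncertainty measure, the round-robin balancing rule, and all function names are assumptions made for illustration. The paper's actual procedure is described in the full text.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def entropy(probs):
    """Shannon entropy of a predicted class distribution (assumed uncertainty score)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def assign_category(feature, prototypes):
    """Pick the category whose prototype is most similar to the feature (hypothetical helper)."""
    return max(prototypes, key=lambda name: cosine(feature, prototypes[name]))


def select_balanced(features, probs, prototypes, budget):
    """Return indices of up to `budget` samples, taking the most uncertain
    sample from each estimated category in turn (round-robin)."""
    buckets = defaultdict(list)
    for idx, (feat, dist) in enumerate(zip(features, probs)):
        buckets[assign_category(feat, prototypes)].append((entropy(dist), idx))
    for bucket in buckets.values():
        bucket.sort(reverse=True)  # most uncertain first

    chosen = []
    order = sorted(buckets)  # fixed category order for reproducibility
    while len(chosen) < budget and any(buckets[name] for name in order):
        for name in order:
            if buckets[name] and len(chosen) < budget:
                chosen.append(buckets[name].pop(0)[1])
    return chosen


if __name__ == "__main__":
    # Toy stand-ins for network features and model predictions.
    prototypes = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
    features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]]
    probs = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4], [0.99, 0.01], [0.55, 0.45]]
    print(select_balanced(features, probs, prototypes, budget=2))
```

In this toy data the two most uncertain samples both fall in the same estimated category, so ranking by entropy alone would pick them together; the round-robin step instead takes one sample from each category, which is the kind of imbalance the abstract says category information is meant to prevent.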

## Full-text entities

- **Chemicals:** CSA (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12233261/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12233261/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/PMC12233261/full.md

---
Source: https://tomesphere.com/paper/PMC12233261