TL;DR
This paper introduces a class-balanced active learning framework that improves image classification on imbalanced datasets by explicitly accounting for the class distribution when selecting samples to label. It can be combined with existing active learning methods to improve their performance.
Contribution
It proposes a general optimization framework for class-balanced active learning that can be integrated with existing algorithms, improving performance on both imbalanced and balanced datasets.
Findings
Effective on imbalanced datasets with long-tail distributions
Compatible with most existing active learning algorithms
Yields performance gains even on balanced datasets
Abstract
Active learning aims to reduce the labeling effort that is required to train algorithms by learning an acquisition function that selects, from a large unlabeled data pool, the most relevant data for which a label should be requested. Active learning is generally studied on balanced datasets where an equal number of images per class is available. However, real-world datasets suffer from severely imbalanced classes, the so-called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we proposed a general optimization framework that explicitly takes class-balancing into account. Results on three datasets showed that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied…
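To make the idea of "explicitly taking class-balancing into account" concrete, here is a minimal Python sketch of one way an acquisition score can be combined with a class-balance term. It is an illustration under stated assumptions, not the paper's formulation or solver: the entropy-based score, the greedy selection loop, the L1 imbalance penalty, the weight `lam`, and the function names `entropy_scores` and `class_balanced_select` are all hypothetical choices made for this example.

```python
import numpy as np


def entropy_scores(probs):
    """Predictive entropy per sample; higher means the model is less certain."""
    eps = 1e-12  # avoids log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)


def class_balanced_select(probs, labeled_counts, budget, lam=1.0):
    """Greedily pick pool indices, trading off uncertainty against class balance.

    probs:          (n_pool, n_classes) softmax outputs on the unlabeled pool.
    labeled_counts: (n_classes,) labels already held per class.
    budget:         number of samples to request labels for.
    lam:            hypothetical weight on the balance penalty (an assumption).
    """
    probs = np.asarray(probs, dtype=float)
    counts = np.asarray(labeled_counts, dtype=float).copy()
    n_pool, n_classes = probs.shape
    scores = entropy_scores(probs)

    # Assumed target: equal expected class counts once the batch is labeled.
    target = (counts.sum() + budget) / n_classes

    remaining = np.ones(n_pool, dtype=bool)
    selected = []
    for _ in range(min(budget, n_pool)):
        # L1 distance from the target if each candidate were added,
        # using its predicted probabilities as a soft label.
        imbalance = np.abs(target - (counts + probs)).sum(axis=1)
        objective = scores - lam * imbalance
        objective[~remaining] = -np.inf  # never pick the same sample twice
        best = int(np.argmax(objective))
        selected.append(best)
        remaining[best] = False
        counts += probs[best]  # update expected class counts
    return selected


if __name__ == "__main__":
    # Toy pool: two samples look like class 0, two look like class 1.
    pool_probs = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
    # The labeled set so far contains only class 0.
    picked = class_balanced_select(pool_probs, labeled_counts=[10, 0], budget=2, lam=10.0)
    print(picked)
```

In this toy pool all four samples have the same entropy, so the balance term alone decides the outcome and the samples predicted as the under-represented class are chosen. Because the balance term is simply added to whatever score is supplied, `entropy_scores` could be swapped for another acquisition function, which is the sense in which such a term can sit on top of existing active learning algorithms.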
Videos
Class-Balanced Active Learning for Image Classification (YouTube)
