Cleaning the Pool: Progressive Filtering of Unlabeled Pools in Deep Active Learning

Denis Huseljic; Marek Herde; Lukas Rauch; Paul Hahn; Bernhard Sick

arXiv:2511.22344·cs.LG·March 27, 2026

Open Access

TL;DR

REFINE is an ensemble active learning method that progressively filters the unlabeled pool using multiple strategies and then selects each batch by coverage, outperforming individual strategies and existing ensemble methods across 6 datasets.

Contribution

We introduce REFINE, an ensemble active learning approach that adaptively filters unlabeled pools using multiple strategies, enhancing data selection and outperforming existing methods.

Findings

01. REFINE outperforms individual strategies and existing ensemble methods across 6 datasets.

02. Progressive filtering improves the effectiveness of any active learning strategy.

03. The method enhances performance in an audio spectrogram classification task.

Abstract

Existing active learning (AL) strategies capture fundamentally different notions of data value, e.g., uncertainty or representativeness. Consequently, the effectiveness of strategies can vary substantially across datasets, models, and even AL cycles. Committing to a single strategy risks suboptimal performance, as no single strategy dominates throughout the entire AL process. We introduce REFINE, an ensemble AL method that combines multiple strategies without knowing in advance which will perform best. In each AL cycle, REFINE operates in two stages: (1) Progressive filtering iteratively refines the unlabeled pool by considering an ensemble of AL strategies, retaining promising candidates capturing different notions of value. (2) Coverage-based selection then chooses a final batch from this refined pool, ensuring all previously identified notions of value are accounted for. Extensive…
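
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how filtering or coverage is computed, so the sketch makes assumptions. Each strategy is assumed to be a precomputed list of scores over pool indices, filtering is assumed to keep the union of every strategy's top fraction in each round, and coverage is approximated by a round-robin pick over the strategies' rankings. The names `progressive_filter`, `coverage_select`, `keep_frac`, and `rounds` are hypothetical.

```python
from typing import Dict, List, Sequence


def progressive_filter(
    pool: Sequence[int],
    scores: Dict[str, List[float]],
    keep_frac: float = 0.5,   # hypothetical parameter
    rounds: int = 2,          # hypothetical parameter
) -> List[int]:
    """Stage 1 (assumed form): shrink the pool over several rounds.

    In each round, every strategy keeps its top `keep_frac` share of the
    current candidates; the survivors are the union of those picks.
    """
    candidates = list(pool)
    for _ in range(rounds):
        k = max(1, int(len(candidates) * keep_frac))
        kept = set()
        for strategy_scores in scores.values():
            ranked = sorted(candidates, key=lambda i: strategy_scores[i], reverse=True)
            kept.update(ranked[:k])
        # Preserve the original pool order among the survivors.
        candidates = [i for i in candidates if i in kept]
    return candidates


def coverage_select(
    candidates: Sequence[int],
    scores: Dict[str, List[float]],
    batch_size: int,
) -> List[int]:
    """Stage 2 (assumed form): round-robin over strategies.

    Each strategy in turn contributes its best not-yet-selected candidate,
    so every notion of value is represented in the batch.
    """
    rankings = {
        name: sorted(candidates, key=lambda i: s[i], reverse=True)
        for name, s in scores.items()
    }
    target = min(batch_size, len(candidates))
    selected: List[int] = []
    chosen = set()
    while len(selected) < target:
        for ranking in rankings.values():
            if len(selected) >= target:
                break
            for i in ranking:
                if i not in chosen:
                    selected.append(i)
                    chosen.add(i)
                    break
    return selected


# Toy example: two strategies that disagree about which samples are valuable.
toy_scores = {
    "uncertainty":        [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4],
    "representativeness": [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6],
}
refined = progressive_filter(range(8), toy_scores, keep_frac=0.25, rounds=1)
batch = coverage_select(refined, toy_scores, batch_size=2)
```

In this toy run the refined pool keeps candidates favored by either strategy, and the batch takes one pick from each. In a real AL cycle the scores would come from the current model and be recomputed after each labeling round.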

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning