Scalable Batch Acquisition for Deep Bayesian Active Learning
Aleksandr Rubashevskii, Daria Kotova, Maxim Panov

TL;DR
This paper introduces Large BatchBALD, an efficient approximation of BatchBALD for deep Bayesian active learning, enabling scalable selection of multiple data points with reduced computational complexity, validated on image and text datasets.
Contribution
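The paper proposes Large BatchBALD, a scalable and computationally efficient algorithm for batch selection in Bayesian active learning. It targets the main limitation of BatchBALD: the exponential complexity of computing mutual information for joint random variables when the batch is large.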
Findings
Significant reduction in computation time for large batch sizes.
Comparable selection quality to BatchBALD in experiments.
Effective on both image and text datasets, including CIFAR-100.
Abstract
In deep active learning, it is important to select multiple examples to label at each step in order to work efficiently, especially on large datasets. At the same time, existing solutions to this problem in the Bayesian setup, such as BatchBALD, have significant limitations when selecting a large number of examples, owing to the exponential complexity of computing mutual information for joint random variables. We therefore present the Large BatchBALD algorithm, a well-grounded approximation to the BatchBALD method that aims to achieve comparable quality while being more computationally efficient. We provide a complexity analysis of the algorithm, showing a reduction in computation time, especially for large batches. Furthermore, we present an extensive set of experimental results on image and text data, both on toy datasets and on larger ones such as CIFAR-100.
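To make the complexity claim in the abstract concrete, here is a minimal sketch of the standard single-point BALD score (mutual information between the label and the model parameters), together with a count of the joint label configurations that an exact joint computation over a batch would have to enumerate. It assumes Monte Carlo samples of predictive probabilities in an array of shape (K, N, C); the function names and array layout are hypothetical, and the code illustrates plain BALD only, not the Large BatchBALD approximation proposed in the paper.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along `axis`; eps guards against log(0)."""
    return -np.sum(p * np.log(p + eps), axis=axis)


def bald_scores(probs):
    """Single-point BALD scores from Monte Carlo predictive samples.

    probs: array of shape (K, N, C) with K posterior samples,
           N candidate points, and C classes (assumed layout).
    Returns an array of shape (N,) holding
    H[mean_k p(y | x, theta_k)] - mean_k H[p(y | x, theta_k)].
    """
    mean_probs = probs.mean(axis=0)                  # (N, C)
    predictive_entropy = entropy(mean_probs)         # (N,)
    expected_entropy = entropy(probs).mean(axis=0)   # (N,)
    return predictive_entropy - expected_entropy


def joint_outcome_count(num_classes, batch_size):
    """Number of joint label configurations for a batch: C ** b."""
    return num_classes ** batch_size


# Two posterior samples, two candidate points, two classes.
probs = np.array([
    [[1.0, 0.0], [0.5, 0.5]],   # sample 1
    [[0.0, 1.0], [0.5, 0.5]],   # sample 2
])
scores = bald_scores(probs)
```

In this toy input the two samples confidently disagree on the first point, so its score is close to log 2, while they agree on the second point, so its score is close to zero even though the prediction itself is uncertain. The helper `joint_outcome_count` shows why the joint version becomes expensive: with 10 classes and a batch of 5 points there are already 100,000 joint label configurations.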
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Topic Modeling
