A Scalable Algorithm for Active Learning
Youguang Chen, Zheyu Wen, George Biros

TL;DR
This paper introduces a scalable, approximate active learning algorithm for multiclass classification that maintains accuracy while significantly reducing computational and storage costs, and demonstrates its effectiveness on large datasets with GPU acceleration.
Contribution
The paper proposes an efficient approximate active learning algorithm that scales better than FIRAL, together with a parallel GPU implementation, while maintaining accuracy on large datasets.
Findings
Achieves comparable accuracy to FIRAL on benchmark datasets.
Demonstrates strong and weak scaling on up to 12 GPUs with three million points.
Significantly reduces storage and computational complexity relative to FIRAL.
Abstract
FIRAL is a recently proposed deterministic active learning algorithm for multiclass classification using logistic regression. It was shown to outperform the state-of-the-art in terms of accuracy and robustness and comes with theoretical performance guarantees. However, its scalability suffers when dealing with datasets featuring a large number of points, dimensions, and classes, due to its storage and computational complexity, which also depends on the number of points to select in active learning. To address these challenges, we propose an approximate algorithm with reduced storage requirements and computational complexity. Additionally, we present a parallel implementation on GPUs. We demonstrate the accuracy and scalability of our approach using MNIST,…
Taxonomy
Topics: Advanced Control Systems Design · Experimental Learning in Engineering
