A Scalable Algorithm for Active Learning

Youguang Chen; Zheyu Wen; George Biros

arXiv:2409.07392·cs.LG·September 12, 2024

TL;DR

This paper introduces a scalable, approximate active learning algorithm for multiclass classification that maintains accuracy while significantly reducing computational and storage costs, and demonstrates its effectiveness on large datasets with GPU acceleration.

Contribution

It proposes an efficient approximate algorithm for active learning that scales better than FIRAL while maintaining accuracy on large datasets, and it provides a parallel GPU implementation.

Findings

01

Achieves accuracy comparable to FIRAL on benchmark datasets.

02

Demonstrates strong and weak scaling on up to 12 GPUs with three million points.

03

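Reduces storage from $O(c^{2} d^{2} + n c^{2} d)$ to $O(n (d + c) + c d^{2})$ and computational complexity from $O(c^{3} (n d^{2} + b d^{3} + bn))$ to $O(bn c d^{2})$.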

Abstract

FIRAL is a recently proposed deterministic active learning algorithm for multiclass classification using logistic regression. It was shown to outperform the state-of-the-art in terms of accuracy and robustness and comes with theoretical performance guarantees. However, its scalability suffers when dealing with datasets featuring a large number of points $n$, dimensions $d$, and classes $c$, due to its $O(c^{2} d^{2} + n c^{2} d)$ storage and $O(c^{3} (n d^{2} + b d^{3} + bn))$ computational complexity where $b$ is the number of points to select in active learning. To address these challenges, we propose an approximate algorithm with storage requirements reduced to $O(n (d + c) + c d^{2})$ and a computational complexity of $O(bn c d^{2})$. Additionally, we present a parallel implementation on GPUs. We demonstrate the accuracy and scalability of our approach using MNIST,…
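
To get a feel for the complexity bounds quoted in the abstract, the sketch below evaluates only their leading-order terms for one problem size. It is an illustration, not the authors' code: the function names are hypothetical, constant factors hidden by the $O(\cdot)$ notation are ignored, and the values chosen for $d$, $c$, and $b$ are assumptions (only the three-million-point figure for $n$ comes from the findings above).

```python
# Leading-order terms of the O(.) bounds quoted in the abstract.
# Constants are ignored, so only the ratios between the two methods are
# indicative. All function names here are hypothetical.

def firal_storage(n, d, c):
    """FIRAL storage term: c^2 d^2 + n c^2 d."""
    return c**2 * d**2 + n * c**2 * d

def approx_storage(n, d, c):
    """Approximate algorithm storage term: n (d + c) + c d^2."""
    return n * (d + c) + c * d**2

def firal_compute(n, d, c, b):
    """FIRAL compute term: c^3 (n d^2 + b d^3 + b n)."""
    return c**3 * (n * d**2 + b * d**3 + b * n)

def approx_compute(n, d, c, b):
    """Approximate algorithm compute term: b n c d^2."""
    return b * n * c * d**2

# n matches the three-million-point experiment; d, c, b are assumed values.
n, d, c, b = 3_000_000, 100, 10, 10

storage_ratio = firal_storage(n, d, c) / approx_storage(n, d, c)
compute_ratio = firal_compute(n, d, c, b) / approx_compute(n, d, c, b)

print(f"storage term ratio (FIRAL / approximate): {storage_ratio:.1f}")
print(f"compute term ratio (FIRAL / approximate): {compute_ratio:.1f}")
```

For these assumed values the storage term shrinks by roughly a factor of 91 and the compute term by roughly a factor of 10. When the $n d^{2}$ term dominates FIRAL's cost, the compute ratio behaves like $c^{2}/b$, so the gain from these bounds depends on how the selection budget $b$ compares with the number of classes.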

Taxonomy

Topics: Advanced Control Systems Design · Experimental Learning in Engineering