Boosting Active Learning via Improving Test Performance

Tianyang Wang; Xingjian Li; Pengkun Yang; Guosheng Hu; Xiangrui Zeng; Siyu Huang; Cheng-Zhong Xu; Min Xu

arXiv:2112.05683·cs.LG·January 25, 2022

TL;DR

This paper introduces an active learning framework that aims to improve test performance by selecting the unlabeled data with higher estimated gradient norm for annotation. The framework is validated on multiple tasks, including image classification, segmentation, and cellular imaging.

Contribution

The paper proposes two schemes, expected-gradnorm and entropy-gradnorm, that estimate the gradient norm of unlabeled data without access to labels, so that gradient norm can be used as a selection criterion in active learning.

Findings

1. Achieves superior performance over state-of-the-art methods.

2. Effective across diverse tasks including image classification and cellular imaging.

3. Demonstrates robustness to noise and domain shifts.

Abstract

Central to active learning (AL) is what data should be selected for annotation. Existing works attempt to select highly uncertain or informative data for annotation. Nevertheless, it remains unclear how selected data impacts the test performance of the task model used in AL. In this work, we explore such an impact by theoretically proving that selecting unlabeled data of higher gradient norm leads to a lower upper-bound of test loss, resulting in better test performance. However, due to the lack of label information, directly computing gradient norm for unlabeled data is infeasible. To address this challenge, we propose two schemes, namely expected-gradnorm and entropy-gradnorm. The former computes the gradient norm by constructing an expected empirical loss while the latter constructs an unsupervised loss with entropy. Furthermore, we integrate the two schemes in a universal AL…
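The entropy-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear softmax classifier, takes the gradient only with respect to that layer's weights, and uses hypothetical names (`entropy_gradnorm`, `select_for_annotation`). The expected-gradnorm scheme is not shown. For logits z = Wx with softmax output p and entropy H, the gradient of H with respect to logit j is -p_j (log p_j + H), so the weight-gradient norm is the norm of that vector times the norm of x.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_grad_wrt_logits(probs):
    """Gradient of H(p) = -sum_i p_i log p_i with respect to the logits.

    Uses the closed form dH/dz_j = -p_j * (log p_j + H).
    """
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return [-p * (math.log(p) + h) if p > 0.0 else 0.0 for p in probs]


def entropy_gradnorm(weights, x):
    """Norm of the entropy-loss gradient w.r.t. the weights of a linear layer.

    Assumption for this sketch: the model is logits = weights @ x, so the
    weight gradient is the outer product g x^T, whose Frobenius norm is
    ||g|| * ||x||. No label is needed to compute it.
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    g = entropy_grad_wrt_logits(softmax(logits))
    g_norm = math.sqrt(sum(v * v for v in g))
    x_norm = math.sqrt(sum(v * v for v in x))
    return g_norm * x_norm


def select_for_annotation(weights, pool, budget):
    """Return indices of the `budget` pool samples with the largest score."""
    scores = [entropy_gradnorm(weights, x) for x in pool]
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    toy_weights = [[1.0, 0.0], [0.0, 1.0]]            # hypothetical 2-class model
    toy_pool = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]  # hypothetical unlabeled pool
    print(select_for_annotation(toy_weights, toy_pool, budget=2))
```

In this toy setting a sample with a uniform prediction has zero entropy gradient, and a sample predicted with very high confidence has a small one, so the score differs from a plain entropy ranking. With a deep network the same quantity would normally be obtained by backpropagating the entropy loss through an autodiff framework rather than from a closed form.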

Code & Models

Repositories

xulabs/aitom (PyTorch, official)

Videos

Boosting Active Learning via Improving Test Performance (Underline)