Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples

Dake Bu; Wei Huang; Taiji Suzuki; Ji Cheng; Qingfu Zhang; Zhiqiang Xu; Hau-San Wong

arXiv:2406.03944·cs.LG·June 7, 2024

Open Access

TL;DR

This paper provides a theoretical explanation for the success of neural active learning methods, showing that both uncertainty and diversity criteria aim to prioritize samples with yet-to-be-learned features, leading to improved test accuracy.

Contribution

It offers a unified feature learning explanation for the success of uncertainty and diversity-based neural active learning, supported by theoretical analysis and experiments.

Findings

1. Both query criteria focus on samples with yet-to-be-learned features.
2. Prioritizing such samples leads to small test error with fewer labeled samples.
3. Passive learning requires more labels to achieve similar accuracy.

Abstract

Neural Network-based active learning (NAL) is a cost-effective data selection technique that utilizes neural networks to select and train on a small subset of samples. While existing work successfully develops various effective or theory-justified NAL algorithms, the understanding of the two commonly used query criteria of NAL, uncertainty-based and diversity-based, remains in its infancy. In this work, we try to move one step forward by offering a unified explanation for the success of both query criteria-based NAL from a feature learning view. Specifically, we consider a feature-noise data model comprising easy-to-learn or hard-to-learn features disrupted by noise, and conduct analysis over 2-layer NN-based NALs in the pool-based scenario. We provably show that both uncertainty-based and diversity-based NAL are inherently amenable to one and the same principle, i.e., striving to…
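
As a rough illustration of the uncertainty-based query criterion in the pool-based scenario named in the abstract, the sketch below picks the unlabeled samples whose predicted probability lies closest to 0.5. It is not the paper's algorithm: it assumes a binary classifier that emits one logit per pool sample, and the names `uncertainty_query` and `pool_logits`, as well as the logit values, are hypothetical.

```python
import math


def sigmoid(z):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def uncertainty_query(scores, n_query):
    """Return indices of the n_query pool samples the model is least sure about.

    Uncertainty is measured here (an assumption, one of several common
    choices) as the distance of the predicted probability from 0.5.
    """
    distance_from_half = [abs(sigmoid(s) - 0.5) for s in scores]
    # Sort pool indices from most uncertain (smallest distance) to least.
    ranked = sorted(range(len(scores)), key=lambda i: distance_from_half[i])
    return ranked[:n_query]


# Hypothetical logits produced by the current model on five unlabeled samples.
pool_logits = [4.0, -0.1, 2.5, 0.3, -3.0]
picked = uncertainty_query(pool_logits, n_query=2)
print(picked)  # indices of the samples to send for labeling
```

In a full pool-based loop, the selected samples would be labeled, added to the training set, and the model retrained before the next query round; a diversity-based criterion would replace the ranking rule with one that favors samples dissimilar to those already labeled.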

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Algorithms