Pareto Optimization for Active Learning under Out-of-Distribution Data Scenarios

Xueying Zhan; Zeyu Dai; Qingzhong Wang; Qing Li; Haoyi Xiong; Dejing Dou; Antoni B. Chan

arXiv:2207.01190 · cs.LG · July 5, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a Pareto optimization-based active learning method that effectively handles out-of-distribution data by balancing informativeness and OOD confidence, improving sampling strategies in complex data scenarios.

Contribution

It proposes Monte-Carlo Pareto Optimization for Active Learning (POAL), a novel multi-objective sampling scheme addressing OOD challenges in AL.
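The multi-objective idea can be illustrated with a plain Pareto-front computation over two scores per unlabeled sample. This is only a minimal sketch, not the paper's Monte-Carlo POAL procedure: the function name `pareto_front`, the brute-force dominance check, and the example scores are all hypothetical, and both objectives (informativeness and in-distribution confidence) are assumed to be maximized.

```python
from typing import List, Tuple


def pareto_front(scores: List[Tuple[float, float]]) -> List[int]:
    """Return indices of non-dominated points, maximizing both objectives.

    Brute-force O(n^2) check; an illustrative stand-in, not the paper's method.
    """
    front = []
    for i, (a_i, b_i) in enumerate(scores):
        dominated = False
        for j, (a_j, b_j) in enumerate(scores):
            if j == i:
                continue
            # j dominates i if it is at least as good on both objectives
            # and strictly better on at least one of them.
            if a_j >= a_i and b_j >= b_i and (a_j > a_i or b_j > b_i):
                dominated = True
                break
        if not dominated:
            front.append(i)
    return front


# Hypothetical (informativeness, ID-confidence) scores for five pool samples.
candidates = [(0.9, 0.2), (0.7, 0.8), (0.4, 0.9), (0.3, 0.3), (0.6, 0.6)]
front = pareto_front(candidates)
print(front)
```

In this toy pool, the samples left on the front are those for which no other sample is better on both scores at once, so a highly informative but likely-OOD sample and a less informative but clearly in-distribution sample can both remain candidates.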

Findings

01 Effective in classical ML tasks

02 Improves OOD sample detection

03 Enhances AL performance in DL tasks

Abstract

Pool-based Active Learning (AL) has achieved great success in minimizing labeling cost by sequentially selecting informative unlabeled samples from a large unlabeled data pool and querying their labels from an oracle or annotators. However, existing AL sampling strategies might not work well in out-of-distribution (OOD) data scenarios, where the unlabeled data pool contains some data samples that do not belong to the classes of the target task. Achieving good AL performance under OOD data scenarios is challenging because of the natural conflict between AL sampling strategies and OOD sample detection. AL selects data that are hard for the current basic classifier to classify (e.g., samples whose predicted class probabilities have high entropy), while OOD samples tend to have more uniform predicted class probabilities (i.e., high entropy) than in-distribution (ID) data. In this paper, we…
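The conflict described in the abstract can be seen in a small worked example. The sketch below computes Shannon entropy for three made-up predicted-probability vectors and ranks them the way an entropy-based sampler would; the vectors, their labels, and the helper name `entropy` are illustrative assumptions, not values from the paper.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical softmax outputs over four target classes.
pool: Dict[str, List[float]] = {
    "confident_id": [0.97, 0.01, 0.01, 0.01],  # easy in-distribution sample
    "ambiguous_id": [0.45, 0.45, 0.05, 0.05],  # informative in-distribution sample
    "ood_like": [0.25, 0.25, 0.25, 0.25],      # near-uniform, as OOD samples tend to be
}

# An entropy-based AL strategy queries the highest-entropy samples first.
ranked = sorted(pool, key=lambda name: entropy(pool[name]), reverse=True)

for name in ranked:
    print(f"{name}: {entropy(pool[name]):.3f}")
```

Under these assumed values the near-uniform vector ranks first, ahead of the genuinely ambiguous in-distribution sample, which is the failure mode that motivates scoring OOD confidence as a separate objective.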

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Advanced Bandit Algorithms Research