Downstream-Pretext Domain Knowledge Traceback for Active Learning

Beichen Zhang; Liang Li; Zheng-Jun Zha; Jiebo Luo; Qingming Huang

arXiv:2407.14720 · cs.LG · July 23, 2024

TL;DR

This paper introduces DOKT, a novel active learning method that leverages downstream knowledge traceback and domain-aware uncertainty estimation to select diverse, informative samples, improving data efficiency across multiple tasks.

Contribution

DOKT uniquely combines downstream knowledge traceback with domain-based uncertainty estimation for enhanced sample selection in active learning.

Findings

01. Outperforms state-of-the-art active learning methods on ten datasets.

02. Effective in diverse scenarios like semantic segmentation and image captioning.

03. Unifies low-level pretext and high-level downstream representations for better sample diversity.

Abstract

Active learning (AL) is designed to construct a high-quality labeled dataset by iteratively selecting the most informative samples. Such sampling relies heavily on data representation, and pre-training has recently become popular for robust feature learning. However, because pre-training uses low-level pretext tasks that lack annotation, directly using pre-trained representations in AL is inadequate for determining the sampling score. To address this problem, we propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance to select diverse and instructive samples near the decision boundary. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. The diversity indicator constructs two feature spaces based on the pre-training pretext model and the downstream…
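To make the idea of "selecting the most informative samples" concrete, here is a minimal sketch of one generic active learning selection step. It is not DOKT: it uses plain predictive-entropy ranking as a stand-in for an uncertainty score, and the function names (`entropy`, `select_most_uncertain`), the `budget` parameter, and the example probabilities are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(
    pool_probs: Sequence[Sequence[float]], budget: int
) -> List[int]:
    """Return indices of the `budget` unlabeled samples with the highest
    predictive entropy, i.e. those the model is least sure about."""
    scores = [entropy(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical softmax outputs of a downstream model on four unlabeled samples.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],  # close call between two classes
]

# Indices of the samples that would be sent for annotation in this round.
picked = select_most_uncertain(pool_probs, budget=2)
print(picked)
```

In a full loop, the picked samples would be labeled, added to the training set, and the model retrained before the next round. A method such as DOKT would replace the entropy score with its own combination of diversity and uncertainty terms.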
