Downstream-Pretext Domain Knowledge Traceback for Active Learning
Beichen Zhang, Liang Li, Zheng-Jun Zha, Jiebo Luo, Qingming Huang

TL;DR
This paper introduces DOKT, an active learning method that combines downstream knowledge traceback with domain-based uncertainty estimation to select diverse, informative samples, improving data efficiency across multiple tasks.
Contribution
DOKT combines a traceback diversity indicator with a domain-based uncertainty estimator to improve sample selection in active learning.
Findings
DOKT outperforms state-of-the-art active learning methods on ten datasets.
It is effective in diverse scenarios such as semantic segmentation and image captioning.
It unifies low-level pretext and high-level downstream representations for better sample diversity.
Abstract
Active learning (AL) is designed to construct a high-quality labeled dataset by iteratively selecting the most informative samples. Such sampling relies heavily on data representation, and pre-training has recently become popular for robust feature learning. However, because pre-training uses low-level pretext tasks that lack annotation, directly using pre-trained representations in AL is inadequate for determining the sampling score. To address this problem, we propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance to select diverse and instructive samples near the decision boundary. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. The diversity indicator constructs two feature spaces based on the pre-training pretext model and the downstream…
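The abstract is cut off before it specifies how the two components are computed, so the following is only a generic illustration of the idea of scoring unlabeled samples by uncertainty plus diversity across two feature spaces. It is not the DOKT algorithm. Everything here is an assumption made for illustration: predictive entropy stands in for the domain-based uncertainty estimator, nearest-labeled-neighbor distance stands in for the traceback diversity indicator, and the names `select_batch`, `pool`, `labeled`, and `weight` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def min_distance(x, reference):
    """Euclidean distance from x to its nearest vector in reference."""
    return min(math.dist(x, y) for y in reference)


def select_batch(pool, labeled, budget, weight=1.0):
    """Greedily pick `budget` sample ids from `pool` (illustrative only).

    pool:    dict mapping id -> (pretext_feature, downstream_feature, class_probs)
    labeled: non-empty list of (pretext_feature, downstream_feature) pairs
    weight:  hypothetical trade-off between uncertainty and diversity
    """
    labeled = list(labeled)   # copy so the caller's list is not modified
    remaining = dict(pool)
    chosen = []
    for _ in range(min(budget, len(remaining))):

        def score(item):
            pre, down, probs = item
            # Assumed diversity proxy: distance to the nearest labeled sample,
            # summed over the pretext and downstream feature spaces.
            diversity = (min_distance(pre, [l[0] for l in labeled])
                         + min_distance(down, [l[1] for l in labeled]))
            # Assumed uncertainty proxy: predictive entropy.
            return entropy(probs) + weight * diversity

        best = max(remaining, key=lambda k: score(remaining[k]))
        pre, down, _ = remaining.pop(best)
        labeled.append((pre, down))   # treat the pick as labeled in later rounds
        chosen.append(best)
    return chosen
```

Adding each pick to the labeled set before the next round keeps the batch from filling up with near-duplicates, which is the usual reason for combining a diversity term with an uncertainty term. In a real pipeline the two feature vectors would come from the pretext model and the downstream model respectively; here they are plain tuples of floats.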
