NTKCPL: Active Learning on Top of Self-Supervised Model by Estimating True Coverage

Ziting Wen; Oscar Pizarro; Stefan Williams

arXiv:2306.04099 · cs.LG · June 8, 2023 · 1 citation

Open Access

TL;DR

This paper introduces NTKCPL, an active learning strategy for use on top of self-supervised models. It estimates empirical risk from clustering-based pseudo-labels and an NTK approximation of the model's predictions, so that sample selection stays effective across datasets and annotation budgets.

Contribution

The paper proposes NTKCPL, a novel active learning method using NTK-based pseudo-label clustering to improve risk estimation and strategy selection in self-supervised learning contexts.
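To make the idea of pseudo-label-based risk estimation concrete, here is a minimal sketch. It is not the authors' algorithm. It assumes a linear classifier on frozen self-supervised features, whose tangent kernel is the plain dot product of features, and it substitutes kernel ridge regression for training. The function names (`kernel_predict`, `estimated_risk`, `select_next`), the `ridge` parameter, the toy features, and the simplification that true labels and pseudo-labels share one label space are all hypothetical choices made for illustration.

```python
import numpy as np

def kernel_predict(K, labeled_idx, Y_onehot, ridge=1e-3):
    """Kernel ridge regression outputs for every pool point, fit on the labeled subset."""
    K_ll = K[np.ix_(labeled_idx, labeled_idx)]   # kernel among labeled points
    K_al = K[:, labeled_idx]                     # kernel between all points and labeled points
    alpha = np.linalg.solve(K_ll + ridge * np.eye(len(labeled_idx)), Y_onehot)
    return K_al @ alpha

def estimated_risk(K, labeled_idx, labels, pseudo, n_classes):
    """Fraction of pool points whose kernel prediction disagrees with its pseudo-label."""
    Y = np.eye(n_classes)[labels]
    pred = kernel_predict(K, labeled_idx, Y).argmax(axis=1)
    return float(np.mean(pred != pseudo))

def select_next(K, labeled_idx, labels, pseudo, n_classes, candidates):
    """Pick the candidate whose (pseudo-labeled) addition gives the lowest estimated risk."""
    best, best_risk = None, np.inf
    for c in candidates:
        risk = estimated_risk(K, labeled_idx + [c], labels + [int(pseudo[c])],
                              pseudo, n_classes)
        if risk < best_risk:
            best, best_risk = c, risk
    return best, best_risk

# Toy pool: two well-separated groups of (hypothetical) frozen features.
X = np.array([[1.0, 0.0], [0.9, 0.1], [1.0, 0.1],
              [0.0, 1.0], [0.1, 0.9], [0.1, 1.0]])
K = X @ X.T                                  # tangent kernel of a linear model on X
pseudo = np.array([0, 0, 0, 1, 1, 1])        # assumed output of a clustering step
labeled_idx, labels = [0], [0]               # one annotated sample so far

risk_before = estimated_risk(K, labeled_idx, labels, pseudo, n_classes=2)
best, risk_after = select_next(K, labeled_idx, labels, pseudo, 2,
                               candidates=[1, 2, 3, 4, 5])
print(risk_before, best, risk_after)
```

In this toy pool the only labeled point belongs to the first group, so the selection favors a candidate from the uncovered second group, because labeling it is what lowers the estimated risk over the whole pool.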

Findings

01. NTKCPL outperforms baseline methods on five datasets.

02. It remains effective over a wider range of training budgets.

03. The method reduces approximation error in risk estimation.

Abstract

High annotation cost for training machine learning classifiers has driven extensive research in active learning and self-supervised learning. Recent research has shown that in the context of supervised learning different active learning strategies need to be applied at various stages of the training process to ensure improved performance over the random baseline. We refer to the point where the number of available annotations changes the suitable active learning strategy as the phase transition point. In this paper, we establish that when combining active learning with self-supervised models to achieve improved performance, the phase transition point occurs earlier. It becomes challenging to determine which strategy should be used for previously unseen datasets. We argue that existing active learning algorithms are heavily influenced by the phase transition because the empirical risk…

Taxonomy

Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Topic Modeling

Methods: Neural Tangent Kernel