Foundation Model Makes Clustering A Better Initialization For Cold-Start Active Learning
Han Yuan, Chuan Hong

TL;DR
This paper integrates foundation models with clustering to select the initial samples for cold-start active learning, which improves downstream model performance, particularly on high-dimensional image data.
Contribution
The study proposes clustering on foundation model embeddings to choose the initial labelled set for cold-start active learning, and reports that this outperforms the usual baselines of random sampling and naive clustering.
Findings
Foundation model-based clustering selects more informative initial samples.
Enhanced model performance on clinical image classification and segmentation tasks.
Faster convergence of clustering with foundation model embeddings.
Abstract
Active learning selects the most informative samples from the unlabelled dataset to annotate under a limited annotation budget. While numerous methods have been proposed for subsequent sample selection based on an initialized model, scant attention has been paid to an indispensable phase of active learning: selecting samples for model cold-start initialization. Most previous studies resort to random sampling or naive clustering. However, random sampling is prone to fluctuation, and naive clustering suffers from slow convergence, particularly when dealing with high-dimensional data such as imaging data. In this work, we propose to integrate foundation models with clustering methods to select samples for cold-start active learning initialization. Foundation models refer to models trained on massive datasets under the self-supervised paradigm and capable of generating…
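The selection idea described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the text above does not confirm: k-means is used as the clustering method, one sample per cluster is chosen (the one nearest its centroid), and the number of clusters equals the annotation budget. The embeddings here are synthetic 2-D points standing in for foundation model outputs, and the function names (`kmeans`, `select_initial_samples`) are hypothetical. In practice a library implementation such as scikit-learn's `KMeans` would likely replace the hand-written loop.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, new_assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        if new_assignments == assignments:
            break  # assignments stopped changing
        assignments = new_assignments
    return centroids, assignments


def select_initial_samples(embeddings, budget, seed=0):
    """Pick up to `budget` indices: the sample nearest each cluster centroid.

    Assumption: this nearest-to-centroid rule is illustrative only and is
    not taken from the paper.
    """
    centroids, assignments = kmeans(embeddings, budget, seed=seed)
    selected = []
    for c, centroid in enumerate(centroids):
        members = [i for i, a in enumerate(assignments) if a == c]
        if not members:
            continue  # an empty cluster contributes no sample
        nearest = min(
            members, key=lambda i: squared_distance(embeddings[i], centroid)
        )
        selected.append(nearest)
    return sorted(selected)


# Synthetic stand-in for foundation model embeddings: three 2-D blobs.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 10.0), (-10.0, 10.0)]
embeddings = [
    [cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)]
    for cx, cy in blob_centers
    for _ in range(20)
]

initial_indices = select_initial_samples(embeddings, budget=3)
print("Indices to send for annotation:", initial_indices)
```

The returned indices would be the samples sent for annotation before the first model is trained. Swapping the synthetic points for real embeddings leaves the selection logic unchanged.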
Taxonomy
Topics: Machine Learning and Algorithms
