Solving Semi-Supervised Few-Shot Learning from an Auto-Annotation Perspective
Tian Liu, Anwesha Basu, James Caverlee, Shu Kong

TL;DR
This paper introduces SWIFT, a simple yet effective method for semi-supervised few-shot learning that leverages open-source vision-language models and significantly improves performance by correcting the overly flat softmax distributions these models produce.
Contribution
The paper proposes SWIFT, a novel stage-wise finetuning approach with temperature tuning, to enhance semi-supervised few-shot learning using open-source vision-language models.
Findings
SWIFT outperforms recent FSL and SSL methods by approximately 5 accuracy points.
SWIFT rivals supervised learning performance on SSFSL benchmarks.
Simple techniques like classifier initialization and temperature tuning improve pseudo-label confidence.
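The link between softmax temperature and pseudo-label confidence can be illustrated with a small sketch. This is not the paper's implementation: the similarity scores, the temperature values, the 0.8 threshold, and the helper names `softmax` and `confident_fraction` are all hypothetical, and the sketch assumes a confidence-thresholded pseudo-labeling rule of the kind common in SSL methods. It only shows that near-uniform probabilities leave no unlabeled example above the threshold, while a lower temperature sharpens the distribution.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a lower temperature gives a sharper distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def confident_fraction(batch_logits, temperature, threshold):
    """Fraction of examples whose top softmax probability reaches the threshold."""
    kept = sum(
        1 for logits in batch_logits
        if max(softmax(logits, temperature)) >= threshold
    )
    return kept / len(batch_logits)

# Hypothetical image-to-class similarity scores for three unlabeled images
# over four classes (illustrative values, not taken from the paper).
similarities = [
    [0.30, 0.22, 0.20, 0.18],
    [0.25, 0.31, 0.21, 0.19],
    [0.24, 0.23, 0.22, 0.21],
]

# With temperature 1.0 the probabilities are nearly uniform, so no example
# passes the (assumed) confidence threshold.
flat = confident_fraction(similarities, temperature=1.0, threshold=0.8)

# With a much lower temperature the two clearer examples pass, while the
# genuinely ambiguous third example is still filtered out.
sharp = confident_fraction(similarities, temperature=0.01, threshold=0.8)

print(f"kept at T=1.0:  {flat:.2f}")
print(f"kept at T=0.01: {sharp:.2f}")
```

In this toy setting the threshold still rejects the ambiguous third example after sharpening, which is the behavior a confidence filter is meant to provide.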
Abstract
Semi-supervised few-shot learning (SSFSL) formulates real-world applications like "auto-annotation", as it aims to learn a model over a few labeled and abundant unlabeled examples to annotate the unlabeled ones. Despite the availability of powerful open-source Vision-Language Models (VLMs) and their pretraining data, the SSFSL literature largely neglects these open-source resources. In contrast, the related area of few-shot learning (FSL) has already exploited them to boost performance. Arguably, to achieve auto-annotation in the real world, SSFSL should leverage such open-source resources. To this end, we start by applying established SSL methods to finetune a VLM. Counterintuitively, they significantly underperform FSL baselines. Our in-depth analysis reveals the root cause: VLMs produce rather "flat" distributions of softmax probabilities. This results in zero utilization of…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
