Vocabulary-free few-shot learning for Vision-Language Models

Maxime Zanella; Clément Fuchs; Ismail Ben Ayed; Christophe De Vleeschouwer

arXiv:2506.04005·cs.CV·June 5, 2025

Open Access

TL;DR

This paper introduces a vocabulary-free few-shot learning method for Vision-Language Models that classifies images based on similarity to generic prompts, removing the need for class-specific labels or prompts.

Contribution

The paper proposes Similarity Mapping (SiM), a baseline that enables efficient, interpretable classification without predefined class names, extending few-shot learning to settings where class names are unavailable or hard to specify.

Findings

01

SiM achieves strong performance on benchmark tasks.

02

The method is computationally efficient, with a runtime often under one second.

03

SiM provides interpretability by linking target classes to generic prompts.

Abstract

Recent advances in few-shot adaptation for Vision-Language Models (VLMs) have greatly expanded their ability to generalize across tasks using only a few labeled examples. However, existing approaches primarily build upon the strong zero-shot priors of these models by leveraging carefully designed, task-specific prompts. This dependence on predefined class names can restrict their applicability, especially in scenarios where exact class names are unavailable or difficult to specify. To address this limitation, we introduce vocabulary-free few-shot learning for VLMs, a setting where target class instances - that is, images - are available but their corresponding names are not. We propose Similarity Mapping (SiM), a simple yet effective baseline that classifies target instances solely based on similarity scores with a set of generic prompts (textual or visual), eliminating the need for…
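To make the idea of classifying from similarity scores concrete, here is a minimal sketch. The abstract as shown here is truncated and does not say how the mapping is learned, so the ridge-regression fit below is an assumption rather than the authors' stated procedure. The function names (`cosine_similarities`, `fit_mapping`, `predict`) are hypothetical, and random vectors stand in for the VLM embeddings of images and generic prompts.

```python
import numpy as np

def cosine_similarities(image_feats, prompt_feats):
    """Cosine similarity between each image and each generic prompt."""
    a = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    b = prompt_feats / np.linalg.norm(prompt_feats, axis=1, keepdims=True)
    return a @ b.T  # shape: (n_images, n_prompts)

def fit_mapping(sims, labels, num_classes, reg=1e-3):
    """Assumed learner: ridge regression from similarity scores to one-hot labels."""
    onehot = np.eye(num_classes)[labels]                 # (n_images, num_classes)
    gram = sims.T @ sims + reg * np.eye(sims.shape[1])   # (n_prompts, n_prompts)
    return np.linalg.solve(gram, sims.T @ onehot)        # (n_prompts, num_classes)

def predict(sims, weights):
    """Assign each image to the class with the highest mapped score."""
    return (sims @ weights).argmax(axis=1)

# Synthetic stand-ins for embeddings (assumption: no real VLM is loaded here).
rng = np.random.default_rng(0)
num_classes, shots, dim, num_prompts = 3, 4, 32, 16
prototypes = rng.normal(size=(num_classes, dim))      # one hidden centre per unnamed class
prompt_feats = rng.normal(size=(num_prompts, dim))    # generic prompts, unrelated to class names
labels = np.repeat(np.arange(num_classes), shots)     # integer class ids only, no class names
support_feats = prototypes[labels] + 0.05 * rng.normal(size=(labels.size, dim))

support_sims = cosine_similarities(support_feats, prompt_feats)
weights = fit_mapping(support_sims, labels, num_classes)
support_pred = predict(support_sims, weights)

# For each class, the index of the generic prompt with the largest absolute weight.
top_prompts = np.abs(weights).argmax(axis=0)
```

In this sketch the labels are only integer ids, so no class name is ever used. Inspecting the columns of `weights`, as `top_prompts` does, is one simple way to see which generic prompts a class is linked to.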

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications