A Novel Benchmark for Few-Shot Semantic Segmentation in the Era of Foundation Models

Reda Bensaid; Vincent Gripon; François Leduc-Primeau; Lukas Mauch; Ghouthi Boukli Hacene; Fabien Cardinaux

arXiv:2401.11311·cs.CV·June 4, 2025·2 cites

TL;DR

This paper introduces a new benchmark for evaluating how well vision foundation models can be adapted to few-shot semantic segmentation, and shows that both the choice of feature extractor and the adaptation method matter.

Contribution

It presents the first study on adapting vision foundation models for few-shot semantic segmentation and proposes a realistic benchmark for this task.

Findings

01. Self-supervised models can outperform segmentation-specific models.

02. Parameter-efficient fine-tuning yields competitive results.

03. The feature extractor plays a critical role in adaptation performance.
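
As a rough illustration of why parameter-efficient fine-tuning is attractive in a few-shot setting, the sketch below counts trainable weights for a single linear layer under full fine-tuning versus a low-rank update. Low-rank adaptation is used here only as one well-known example of the idea; this listing does not say which methods or settings the paper uses, and the layer size, rank, and function names are hypothetical.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a rank-r update B @ A (A: r x d_in, B: d_out x r)."""
    return rank * (d_in + d_out)


# Hypothetical layer size and rank, chosen only for illustration.
d_in, d_out, rank = 768, 768, 4

full = full_finetune_params(d_in, d_out)
low_rank = low_rank_params(d_in, d_out, rank)
ratio = low_rank / full  # fraction of the layer's weights that are trained

print(f"full: {full}, low-rank: {low_rank}, ratio: {ratio:.4f}")
```

With only a handful of labeled images per class, training a small fraction of the weights while keeping the pre-trained feature extractor frozen limits how far the model can drift from its pre-trained representation.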

Abstract

Few-shot semantic segmentation (FSS) is a crucial challenge in computer vision, driving extensive research into a diverse range of methods, from advanced meta-learning techniques to simple transfer learning baselines. With the emergence of vision foundation models (VFM) serving as generalist feature extractors, we seek to explore the adaptation of these models for FSS. While current FSS benchmarks focus on adapting pre-trained models to new tasks with few images, they emphasize in-domain generalization, making them less suitable for VFM trained on large-scale web datasets. To address this, we propose a novel realistic benchmark with a simple and straightforward adaptation process tailored for this task. Using this benchmark, we conduct a comprehensive comparative analysis of prominent VFM and semantic segmentation models. To evaluate their effectiveness, we leverage various adaption…

Code & Models

Repositories

redabensaidds/foundation_fewshot
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Softmax · Layer Normalization · Residual Connection · Linear Layer · Multi-Head Attention · Dense Connections · Vision Transformer · self-DIstillation with NO labels · Contrastive Language-Image Pre-training