Mind the Discriminability Trap in Source-Free Cross-domain Few-shot Learning

Zhenyu Zhang; Yixiong Zou; Yuhua Li; Ruixuan Li; Guangyao Chen

arXiv:2603.13341·cs.CV·March 17, 2026

Open Access

TL;DR

This paper investigates why increasing visual discriminability can harm performance in source-free cross-domain few-shot learning with vision-language models and proposes a method to improve cross-modal alignment during fine-tuning.

Contribution

It reveals the counterintuitive phenomenon that enhancing visual discriminability suppresses VLM performance and proposes a perturbation-based approach to improve cross-modal alignment.

Findings

01. Achieves state-of-the-art results on multiple datasets.

02. Demonstrates the importance of cross-modal alignment in SF-CDFSL.

03. Provides theoretical and experimental insights into the discriminability trap.

Abstract

Source-Free Cross-Domain Few-Shot Learning (SF-CDFSL) focuses on fine-tuning with limited training data from target domains (e.g., medical or satellite images), where Vision-Language Models (VLMs) such as CLIP and SigLIP have shown promising results. Prior work on traditional visual models suggests that improving visual discriminability enhances performance. However, in VLM-based SF-CDFSL tasks, we find that strengthening visual-modal discriminability actually suppresses VLMs' performance. In this paper, we investigate this phenomenon to provide both an interpretation and a solution. Through both theoretical analysis and experiments, our study reveals that fine-tuning with the typical cross-entropy loss ($L_{vlm}$) inherently includes a visual learning part and a cross-modal learning part, where the cross-modal part is crucial for rectifying the heavily disrupted…
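For readers unfamiliar with the loss the abstract refers to, the following is a minimal sketch of the cross-entropy objective commonly used when fine-tuning CLIP-style models for classification: an image embedding is compared against one text embedding per class, and the scaled cosine similarities are treated as logits. This is an assumption about the general form of $L_{vlm}$, not the paper's exact definition, which is not given in this excerpt. The function names, the `scale` parameter, and the toy vectors are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def vlm_cross_entropy(image_feat, text_feats, label, scale=100.0):
    """Cross-entropy over scaled image-text cosine similarities.

    image_feat: embedding of one image (list of floats)
    text_feats: one embedding per class prompt (list of lists of floats)
    label:      index of the correct class
    scale:      hypothetical temperature multiplier applied to the similarities
    """
    img = l2_normalize(image_feat)
    logits = []
    for t in text_feats:
        txt = l2_normalize(t)
        logits.append(scale * sum(a * b for a, b in zip(img, txt)))
    # Numerically stable log-sum-exp over the class logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # Negative log-probability of the correct class.
    return log_sum_exp - logits[label]


# Toy example: the image embedding lies close to the class-0 text embedding.
image_feat = [0.9, 0.1, 0.0]
text_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
loss_correct = vlm_cross_entropy(image_feat, text_feats, label=0)
loss_wrong = vlm_cross_entropy(image_feat, text_feats, label=1)
```

In this form, the gradient reaches the image embedding only through its similarity to the text embeddings. The sketch does not reproduce the paper's decomposition into visual and cross-modal parts.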

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications