Rethinking the Need for Source Models: Source-Free Domain Adaptation from Scratch Guided by a Vision-Language Model

Zhou Bingtao; Xiang Mian; Ning Qian

arXiv:2605.02604·cs.CV·May 5, 2026

TL;DR

This paper introduces a new source-free domain adaptation setting that relies solely on a vision-language model and unlabeled target data, eliminating dependence on source models, and proposes a two-stage framework that achieves competitive results.

Contribution

The paper proposes VODA, a stricter source-free domain adaptation setting, and introduces TS-DRD, a two-stage method that effectively guides adaptation without source models.

Findings

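01 TS-DRD achieves competitive performance on multiple benchmarks.

02 In existing ViL-guided SFDA methods, different source models yield minimal variation in final results on the same target domain, indicating that the source model itself has limited impact on adaptation.

03 Under the VODA setting, TS-DRD adapts a randomly initialized model using only guidance from the ViL model and unlabeled target data, with no source model.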

Abstract

Source-Free Domain Adaptation (SFDA) adapts source models to target domains without accessing source data, addressing privacy and transmission issues. However, existing methods still initialize from a source pre-trained model and thus are not truly source-free. Recent works have introduced Vision-Language (ViL) models to guide the adaptation process. In these methods, we observe that, for the same target domain, different source models yield minimal variation in final results, indicating that the source model itself has limited impact. Motivated by this, we propose ViL-Only Domain Adaptation (VODA), a stricter setting that eliminates all dependencies on the source domain, relying solely on a randomly initialized model, a ViL model, and unlabeled target data. We analyze the adaptation dynamics of VODA and introduce Two-Stage Denoised-Region Distillation (TS-DRD), a two-stage framework that first…
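
To make the VODA setting concrete, the following is a minimal sketch of generic zero-shot distillation: a ViL teacher turns image-text similarities into soft labels, and a student (which under VODA would start from random initialization) is scored against them with a KL loss. This is not the authors' TS-DRD procedure, whose stages are not described in the text above. The function names, the toy embeddings, and the temperature of 0.01 (a common CLIP-style scale) are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def teacher_probs(image_emb, class_text_embs, temperature=0.01):
    """Zero-shot soft labels: softmax over image-to-class-text similarities.

    The embeddings are assumed to come from a frozen ViL model; here they
    are plain lists so the sketch stays self-contained.
    """
    sims = [cosine(image_emb, t) for t in class_text_embs]
    return softmax(sims, temperature)

def kl_distill_loss(teacher_p, student_logits, eps=1e-12):
    """KL(teacher || student) for one unlabeled target sample."""
    student_p = softmax(student_logits)
    return sum(
        p * (math.log(p + eps) - math.log(q + eps))
        for p, q in zip(teacher_p, student_p)
    )

# Toy example (hypothetical values): one image embedding, two class prompts.
image_emb = [0.9, 0.1, 0.0]
class_text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
p = teacher_probs(image_emb, class_text_embs)

# An untrained student with uniform logits versus one that agrees with the teacher.
loss_uniform = kl_distill_loss(p, [0.0, 0.0])
loss_aligned = kl_distill_loss(p, [5.0, -5.0])
```

In this toy case the loss is lower for the student whose logits agree with the teacher than for the uniform one, which is the signal a training loop would minimize over unlabeled target images. No source data or source-trained weights appear anywhere in the computation.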
