Combining inherent knowledge of vision-language models with unsupervised domain adaptation through strong-weak guidance

Thomas Westfechtel; Dexuan Zhang; Tatsuya Harada

arXiv:2312.04066 · cs.CV · December 2, 2024 · 2 cites

TL;DR

This paper presents a novel approach combining unsupervised domain adaptation with the inherent zero-shot capabilities of vision-language models, using a strong-weak guidance scheme to improve cross-domain image classification.

Contribution

It introduces a strong-weak guidance learning scheme that leverages zero-shot predictions and knowledge distillation to enhance unsupervised domain adaptation in vision-language models.

Findings

1. Outperforms state-of-the-art methods on three benchmarks.
2. Effective integration of zero-shot predictions improves adaptation.
3. Ablation studies confirm the contribution of each component.

Abstract

Unsupervised domain adaptation (UDA) tries to overcome the tedious work of labeling data by leveraging a labeled source dataset and transferring its knowledge to a similar but different target dataset. Meanwhile, current vision-language models exhibit remarkable zero-shot prediction capabilities. In this work, we combine knowledge gained through UDA with the inherent knowledge of vision-language models. We introduce a strong-weak guidance learning scheme that employs zero-shot predictions to help align the source and target datasets. For the strong guidance, we expand the source dataset with the most confident samples of the target dataset. Additionally, we employ a knowledge distillation loss as weak guidance. The strong guidance uses hard labels but is only applied to the most confident predictions from the target dataset. Conversely, the weak guidance is applied to the whole dataset…
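The two guidance signals described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the official repository for that); it only shows the general idea under stated assumptions. The function names `select_confident` and `distillation_loss`, the `top_fraction` parameter, and the use of a KL divergence as the distillation loss are all hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities (numerically stable)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select_confident(zero_shot_probs, top_fraction=0.5):
    """Strong guidance (sketch): keep only the most confident target samples
    and give them hard pseudo-labels from the zero-shot prediction.

    `top_fraction` is an assumed hyperparameter, not a value from the paper.
    Returns a list of (sample_index, hard_label), most confident first.
    """
    scored = [(max(p), i, p.index(max(p))) for i, p in enumerate(zero_shot_probs)]
    scored.sort(reverse=True)
    keep = max(1, int(len(scored) * top_fraction))
    return [(i, label) for _, i, label in scored[:keep]]


def distillation_loss(student_probs, teacher_probs, eps=1e-12):
    """Weak guidance (sketch): mean KL(teacher || student) over ALL target
    samples, using soft zero-shot predictions as the teacher signal.

    KL divergence is an assumed form of the knowledge distillation loss.
    """
    total = 0.0
    for s, t in zip(student_probs, teacher_probs):
        total += sum(ti * math.log((ti + eps) / (si + eps)) for ti, si in zip(t, s))
    return total / len(student_probs)


# Toy zero-shot predictions for three target images over two classes.
zero_shot = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]]
pseudo_labeled = select_confident(zero_shot, top_fraction=0.7)

# Toy student predictions obtained from made-up logits.
student = [softmax(l) for l in ([2.0, 0.0], [0.0, 0.0], [0.0, 1.0])]
weak_loss = distillation_loss(student, zero_shot)
```

In this sketch the pairs in `pseudo_labeled` would be appended to the labeled source set, while `weak_loss` would be added to the training objective with a smaller weight, mirroring the split between hard labels on a confident subset and soft supervision on the whole target set.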

Code & Models

Repositories

ThomasWestfechtel/SWG (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research

Methods: ALIGN · Knowledge Distillation