Unsupervised Domain Adaption Harnessing Vision-Language Pre-training

Wenlve Zhou; Zhiheng Zhou

arXiv:2408.02192 · cs.CV · August 6, 2024

TL;DR

This paper introduces an approach to Unsupervised Domain Adaptation (UDA) built on Vision-Language Pre-training (VLP) models. It combines Cross-Modal Knowledge Distillation (CMKD), which uses a VLP model as a teacher to improve target-domain performance, with Residual Sparse Training (RST), which reduces the storage needed per transfer task.

Contribution

The paper proposes a new method leveraging VLP models for UDA, introducing CMKD and RST to enhance performance and efficiency over existing techniques.

Findings

01. Achieves state-of-the-art results on standard benchmarks.

02. Reduces storage overhead significantly compared to traditional fine-tuning.

03. Demonstrates the effectiveness of VLP models in UDA tasks.
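This page does not spell out how RST achieves the storage reduction in the second finding. One plausible reading, shown here purely as an assumption, is to keep a single shared pre-trained checkpoint and store, for each task, only the weight changes large enough to matter. The function names, the threshold value, and the toy weights below are all hypothetical and are not taken from the paper or its repository.

```python
def sparse_residual(pretrained, finetuned, threshold=1e-2):
    """Return {index: delta} for weights whose change exceeds the threshold.

    Assumption: small changes can be dropped without hurting the task much.
    """
    residual = {}
    for i, (p, f) in enumerate(zip(pretrained, finetuned)):
        delta = f - p
        if abs(delta) > threshold:
            residual[i] = delta
    return residual


def restore(pretrained, residual):
    """Rebuild task weights from the shared checkpoint plus a sparse residual."""
    weights = list(pretrained)
    for i, delta in residual.items():
        weights[i] += delta
    return weights


# Toy flat weight vectors (hypothetical values).
pretrained = [0.50, -0.20, 0.10, 0.80]
finetuned = [0.505, -0.20, 0.40, 0.30]

residual = sparse_residual(pretrained, finetuned)  # keeps indices 2 and 3 only
restored = restore(pretrained, residual)
print(f"stored {len(residual)} of {len(pretrained)} weights for this task")
```

Under this assumption, per-task storage grows with the number of retained residual entries rather than with the full model size, at the cost of an approximation error bounded by the threshold on each dropped weight.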

Abstract

This paper addresses two vital challenges in Unsupervised Domain Adaptation (UDA) with a focus on harnessing the power of Vision-Language Pre-training (VLP) models. Firstly, UDA has primarily relied on ImageNet pre-trained models. However, the potential of VLP models in UDA remains largely unexplored. The rich representation of VLP models holds significant promise for enhancing UDA tasks. To address this, we propose a novel method called Cross-Modal Knowledge Distillation (CMKD), leveraging VLP models as teacher models to guide the learning process in the target domain, resulting in state-of-the-art performance. Secondly, current UDA paradigms involve training separate models for each task, leading to significant storage overhead and impractical model deployment as the number of transfer tasks grows. To overcome this challenge, we introduce Residual Sparse Training (RST) exploiting the…
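The teacher-student idea in the abstract can be illustrated with a generic knowledge distillation loss. This is a minimal sketch of standard temperature-softened distillation, not the paper's CMKD objective, whose exact form is not given on this page. The teacher logits stand in for class scores a VLP model might produce for an unlabeled target image; all values, names, and the temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class distributions."""
    p = softmax(teacher_logits, temperature)  # teacher target distribution
    q = softmax(student_logits, temperature)  # student prediction
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))


# Hypothetical scores for one unlabeled target-domain image, three classes.
teacher_logits = [4.0, 1.0, 0.5]
student_close = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
student_far = [0.5, 1.0, 4.0]     # disagrees with the teacher

loss_close = distillation_loss(student_close, teacher_logits)
loss_far = distillation_loss(student_far, teacher_logits)
print(f"close: {loss_close:.4f}, far: {loss_far:.4f}")
```

Because the loss needs no ground-truth label, it can be computed on unlabeled target-domain images, which is what makes a teacher signal of this kind usable in the UDA setting.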

Code & Models

Repositories

Wenlve-Zhou/VLP-UDA (PyTorch, official)

Taxonomy

Methods: Knowledge Distillation · Focus · FixMatch