Transitive Vision-Language Prompt Learning for Domain Generalization

Liyuan Wang; Yan Jin; Zhen Chen; Jinlin Wu; Mengke Li; Yang Lu; Hanzi Wang

arXiv:2404.18758 · cs.CV · April 30, 2024 · 1 citation

Open Access

TL;DR

This paper proposes a novel vision-language prompt learning approach that balances domain invariance and class separability, significantly enhancing model generalization across unseen domains with state-of-the-art results.

Contribution

It introduces a deep vision prompt and language prompt strategy with adaptive weighting to improve domain generalization in vision-language models.

Findings

01. Achieves state-of-the-art performance on three datasets.

02. Deep vision prompts effectively extract domain-invariant features.

03. Balancing domain invariance and class separability improves generalization.

Abstract

Vision-language pre-training has enabled deep models to make a huge step forward in generalizing across unseen domains. Recent learning methods based on vision-language pre-trained models are a great tool for domain generalization (DG) and can solve this problem to a large extent. However, these advances still suffer from a trade-off between domain invariance and class separability, both of which are crucial in current DG problems. In this paper, we introduce a novel prompt learning strategy that leverages deep vision prompts to address domain invariance while utilizing language prompts to ensure class separability, coupled with adaptive weighting mechanisms to balance domain…
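
To make the idea of balancing two objectives concrete, here is a minimal sketch of one generic way to weight a domain-invariance loss against a class-separability loss. The abstract does not specify the paper's weighting mechanism, so everything here is an assumption for illustration: the softmax-normalized weights, the function names `softmax` and `balanced_loss`, and the example loss values are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_loss(invariance_loss, separability_loss, weight_logits):
    """Combine two loss terms with weights that sum to 1.

    Hypothetical illustration: weight_logits stands in for two
    learnable scalars; the paper's actual mechanism may differ.
    """
    w_inv, w_sep = softmax(weight_logits)
    total = w_inv * invariance_loss + w_sep * separability_loss
    return total, (w_inv, w_sep)


# Equal logits give equal weights, so the result is the plain average.
loss, weights = balanced_loss(0.8, 0.4, [0.0, 0.0])
```

In a real training loop the two logits would typically be optimized alongside the prompts, so the balance could shift as one objective starts to dominate; that design choice is likewise an assumption rather than something the abstract states.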

Taxonomy

Topics: Multimodal Machine Learning Applications