UCDR-Adapter: Exploring Adaptation of Pre-Trained Vision-Language Models for Universal Cross-Domain Retrieval
Haoyu Jiang, Zhi-Qi Cheng, Gabriel Moreira, Jiawen Zhu, Jingdong Sun, Bukun Ren, Jun-Yan He, Qi Dai, and Xian-Sheng Hua

TL;DR
UCDR-Adapter introduces a novel method combining adapters and dynamic prompt generation to improve universal cross-domain retrieval, enabling better adaptation to unseen domains and classes without relying on semantic labels.
Contribution
It proposes a two-phase training strategy with source adapter learning and target prompt generation, enhancing pre-trained vision-language models for flexible, zero-shot retrieval across diverse domains.
Findings
Outperforms state-of-the-art methods on multiple UCDR benchmarks
Demonstrates robust generalization to unseen domains and classes
Achieves higher retrieval accuracy with dynamic prompts during inference
Abstract
Universal Cross-Domain Retrieval (UCDR) retrieves relevant images from unseen domains and classes without semantic labels, ensuring robust generalization. Existing methods commonly employ prompt tuning with pre-trained vision-language models but are inherently limited by static prompts, reducing adaptability. We propose UCDR-Adapter, which enhances pre-trained models with adapters and dynamic prompt generation through a two-phase training strategy. First, Source Adapter Learning integrates class semantics with domain-specific visual knowledge using a Learnable Textual Semantic Template and optimizes Class and Domain Prompts via momentum updates and dual loss functions for robust alignment. Second, Target Prompt Generation creates dynamic prompts by attending to masked source prompts, enabling seamless adaptation to unseen domains and classes. Unlike prior approaches, UCDR-Adapter…
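The idea of creating a dynamic prompt "by attending to masked source prompts" can be illustrated with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the architecture, so the use of an image feature as the attention query, single-head scaled dot-product attention, and the names `generate_target_prompt`, `source_prompts`, and `mask` are all hypothetical.

```python
import numpy as np


def generate_target_prompt(query, source_prompts, mask):
    """Build one dynamic prompt as an attention-weighted mix of source prompts.

    Assumed (hypothetical) shapes:
      query:          (d,)   feature of the input image, used as attention query
      source_prompts: (n, d) learned source prompts, used as keys and values
      mask:           (n,)   boolean, True = prompt is visible, False = masked out
    """
    if not mask.any():
        raise ValueError("at least one source prompt must remain visible")

    d = query.shape[-1]
    # Scaled dot-product scores between the query and every source prompt.
    scores = source_prompts @ query / np.sqrt(d)
    # Masked prompts get -inf so they receive exactly zero attention weight.
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax over the visible prompts.
    scores = scores - scores[mask].max()
    weights = np.exp(scores)
    weights = weights / weights.sum()
    # The dynamic prompt is the weighted combination of source prompts.
    return weights @ source_prompts, weights


# Toy data: 6 source prompts of dimension 8, two of them masked.
rng = np.random.default_rng(0)
source_prompts = rng.normal(size=(6, 8))
query = rng.normal(size=8)
mask = np.array([True, False, True, True, False, True])

prompt, weights = generate_target_prompt(query, source_prompts, mask)
```

In this sketch the generated prompt depends on the input feature, which is the sense in which it is "dynamic", in contrast to a static prompt that stays fixed for every input. How the paper chooses the mask and which features serve as queries is not stated here and would need to be taken from the full text.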
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
Methods: Adapter
