Prmpt2Adpt: Prompt-Based Zero-Shot Domain Adaptation for Resource-Constrained Environments
Yasir Ali Farrukh, Syed Wali, Irfan Khan, Nathaniel D. Bastian

TL;DR
Prmpt2Adpt introduces a prompt-guided, zero-shot domain adaptation framework for resource-limited vision systems, enabling fast and efficient adaptation with minimal data and computational resources.
Contribution
The paper presents a lightweight, prompt-based domain adaptation method built on a teacher-student model with a CLIP backbone, suited to low-resource environments.
Findings
Achieves competitive detection performance on the MDS-A dataset.
Adapts up to 7x faster than state-of-the-art methods.
Delivers 5x faster inference while using only a few source images.
Abstract
Unsupervised Domain Adaptation (UDA) is a critical challenge in real-world vision systems, especially in resource-constrained environments like drones, where memory and computation are limited. Existing prompt-driven UDA methods typically rely on large vision-language models and require full access to source-domain data during adaptation, limiting their applicability. In this work, we propose Prmpt2Adpt, a lightweight and efficient zero-shot domain adaptation framework built around a teacher-student paradigm guided by prompt-based feature alignment. At the core of our method is a distilled and fine-tuned CLIP model, used as the frozen backbone of a Faster R-CNN teacher. A small set of low-level source features is aligned to the target domain semantics (specified only through a natural language prompt) via Prompt-driven Instance Normalization (PIN). These semantically steered features are…
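To make the feature-alignment step concrete, here is a minimal sketch of the statistic-swapping operation that instance-normalization-based restyling generally relies on: each channel of a source feature map is normalized by its own mean and standard deviation, then rescaled to target statistics. This is an illustration under assumptions, not the authors' implementation. The function names (`instance_stats`, `pin_restyle`) and the target statistics are hypothetical. In prompt-driven methods the target statistics are usually obtained by optimizing them against a text embedding of the prompt, and that optimization is omitted here.

```python
import math


def instance_stats(channel, eps=1e-5):
    """Return the mean and (eps-stabilized) standard deviation of one channel."""
    n = len(channel)
    mean = sum(channel) / n
    var = sum((x - mean) ** 2 for x in channel) / n
    return mean, math.sqrt(var + eps)


def pin_restyle(features, target_mean, target_std, eps=1e-5):
    """Restyle a feature map channel by channel.

    features:    list of channels, each a flat list of activations
    target_mean: one target mean per channel (hypothetical values here;
                 a prompt-driven method would learn these from the prompt)
    target_std:  one target standard deviation per channel (same caveat)
    """
    restyled = []
    for channel, mu_t, sigma_t in zip(features, target_mean, target_std):
        mu_s, sigma_s = instance_stats(channel, eps)
        # Remove the source statistics, then impose the target statistics.
        restyled.append([sigma_t * (x - mu_s) / sigma_s + mu_t for x in channel])
    return restyled


# Toy 2-channel feature map with 4 spatial positions per channel.
source = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
styled = pin_restyle(source, target_mean=[10.0, -1.0], target_std=[2.0, 0.5])
```

After the call, each channel of `styled` has approximately the requested mean and standard deviation, while the relative ordering of activations within a channel is preserved. Only the per-channel statistics change, which is why this kind of operation can be applied to a small set of stored source features.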
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Face recognition and analysis
Methods: Convolution · Softmax · Region Proposal Network · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Faster R-CNN · Instance Normalization · Contrastive Language-Image Pre-training · Sparse Evolutionary Training
