CLIPArTT: Adaptation of CLIP to New Domains at Test Time
Gustavo Adolfo Vargas Hakim, David Osowiechi, Mehrdad Noori, Milad Cheraghalikhani, Ali Bahri, Moslem Yazdanpanah, Ismail Ben Ayed, and Christian Desrosiers

TL;DR
CLIPArTT is a test-time adaptation method for CLIP that automatically constructs text prompts during inference, improving robustness across various datasets without additional training or model modifications.
Contribution
The paper presents a novel, minimally invasive test-time adaptation technique for CLIP that uses pseudo labels from aggregated class prompts, and standardizes TTA benchmarks for vision-language models.
Findings
Enhances CLIP performance on corrupted and synthetic datasets
Does not require additional training or model modifications
Improves robustness across diverse datasets
Abstract
Pre-trained vision-language models (VLMs), exemplified by CLIP, demonstrate remarkable adaptability across zero-shot classification tasks without additional training. However, their performance diminishes in the presence of domain shifts. In this study, we introduce CLIP Adaptation duRing Test-Time (CLIPArTT), a fully test-time adaptation (TTA) approach for CLIP, which automatically constructs text prompts during inference and uses them as text supervision. Our method employs a unique, minimally invasive text prompt tuning process, wherein multiple predicted classes are aggregated into a single new text prompt, used as a pseudo label to re-classify inputs in a transductive manner. Additionally, we pioneer the standardization of TTA benchmarks (e.g., TENT) in the realm of VLMs. Our findings demonstrate that, without requiring additional transformations nor new trainable…
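The prompt aggregation step described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names, the template "a photo of a {}", the " or " joiner, and the choice of k are all assumptions made for illustration, and the scores stand in for image-text similarities that CLIP would produce. It only shows how the top predicted classes for each input might be merged into one new text prompt that then serves as a pseudo label.

```python
from typing import List, Sequence


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return indices of the k highest scores, best first (ties: lower index wins)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def build_pseudo_prompt(
    scores: Sequence[float],
    class_names: Sequence[str],
    k: int = 3,                          # assumed number of classes to aggregate
    template: str = "a photo of a {}",   # assumed prompt template
) -> str:
    """Aggregate the k most likely class names into a single text prompt."""
    if len(scores) != len(class_names):
        raise ValueError("scores and class_names must have the same length")
    k = min(k, len(class_names))
    picked = [class_names[i] for i in top_k_indices(scores, k)]
    return template.format(" or ".join(picked))


# Hypothetical per-image class scores for a batch of two images.
class_names = ["cat", "dog", "truck", "ship"]
batch_scores = [
    [0.10, 0.60, 0.05, 0.25],
    [0.40, 0.30, 0.20, 0.10],
]

# One aggregated prompt per image in the batch.
prompts = [build_pseudo_prompt(s, class_names, k=2) for s in batch_scores]
for p in prompts:
    print(p)
```

In a full pipeline, these per-image prompts would be encoded by the text encoder and compared against the image features of the same batch, which is where the transductive re-classification mentioned in the abstract would take place. That part is omitted here because the truncated abstract does not specify it.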
Taxonomy
Topics: Analytical Chemistry and Sensors
Methods: Contrastive Language-Image Pre-training
