ETTA: Efficient Test-Time Adaptation for Vision-Language Models through Dynamic Embedding Updates
Hamidreza Dastmalchi, Aijun An, Ali Cheraghian

TL;DR
ETTA introduces a dynamic, recursive embedding update mechanism for vision-language models that enhances test-time adaptation efficiency and accuracy by integrating all incoming test data and reducing prompt dependency.
Contribution
The paper proposes ETTA, a novel test-time adaptation method that combines recursive embedding updates with an adaptive ensemble, improving on cache-based approaches for vision-language models.
Findings
ETTA outperforms state-of-the-art TTA models in accuracy.
ETTA reduces computational complexity and memory usage.
ETTA effectively adapts to distribution shifts in benchmark tests.
Abstract
Pretrained vision-language models (VLMs) like CLIP show strong zero-shot performance but struggle with generalization under distribution shifts. Test-Time Adaptation (TTA) addresses this by adapting VLMs to unlabeled test data in new domains. While some TTA methods rely on prompt-tuning, training-free cache-based approaches are preferred for efficiency. However, current cache-based TTA models store only a limited set of high-confidence samples, restricting the decision boundary to these samples and ignoring the influence of other incoming test data. To address this, we propose Efficient Test-Time Adaptation (ETTA), introducing a Recursive Updating module that integrates all incoming test samples, progressively refining the decision boundary. This strategy mimics an unbounded cache, dynamically updating contextual embeddings for improved accuracy with minimal memory and computational…
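The abstract does not give the exact update rule, so the following is only a minimal sketch of the general idea of a recursive update that "mimics an unbounded cache". It assumes one running-mean embedding per pseudo-labeled class. The class name `RunningClassEmbeddings`, its `update` method, and the running-mean rule itself are hypothetical illustrations, not the authors' implementation.

```python
from typing import Dict, List


class RunningClassEmbeddings:
    """Per-class running mean of test embeddings.

    Illustrative assumption only: the paper's actual Recursive Updating
    module may use a different rule (e.g. confidence weighting).
    """

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.means: Dict[int, List[float]] = {}  # class id -> mean embedding
        self.counts: Dict[int, int] = {}         # class id -> samples seen

    def update(self, label: int, embedding: List[float]) -> None:
        """Fold one test embedding into the mean of its pseudo-label class."""
        if len(embedding) != self.dim:
            raise ValueError("embedding has wrong dimension")
        n = self.counts.get(label, 0)
        mean = self.means.get(label, [0.0] * self.dim)
        # Recursive mean: m_{n+1} = m_n + (x - m_n) / (n + 1).
        # This equals the average of every sample seen so far for the class,
        # yet stores only one vector and one counter per class.
        self.means[label] = [m + (x - m) / (n + 1) for m, x in zip(mean, embedding)]
        self.counts[label] = n + 1


store = RunningClassEmbeddings(dim=2)
store.update(0, [1.0, 0.0])
store.update(0, [3.0, 0.0])
print(store.means[0], store.counts[0])
```

Under this assumption, memory stays fixed at one vector per class no matter how many test samples arrive, whereas a bounded cache must discard samples once it is full.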
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Topic Modeling
