Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations

Yiwen Liang; Hui Chen; Yizhe Xiong; Zihan Zhou; Mengyao Lyu; Zijia Lin; Shuaicheng Niu; Sicheng Zhao; Jungong Han; Guiguang Ding

arXiv:2507.09500·cs.CV·August 14, 2025

TL;DR

This paper introduces ReTA, a novel method for improving the reliability of test-time adaptation in vision-language models under visual variations by addressing entropy unreliability and decision boundary inflexibility.

Contribution

ReTA combines consistency-aware entropy reweighting and diversity-driven distribution calibration to enhance robustness and accuracy during test-time adaptation.

Findings

01 ReTA outperforms existing methods under real-world distribution shifts.

02 The proposed methods improve cache quality and decision boundary flexibility.

03 ReTA demonstrates consistent performance gains across multiple benchmarks.

Abstract

Vision-language models (VLMs) exhibit remarkable zero-shot capabilities but struggle with distribution shifts in downstream tasks when labeled data is unavailable, which has motivated the development of Test-Time Adaptation (TTA) to improve VLMs' performance during inference without annotations. Among various TTA approaches, cache-based methods show promise by preserving historical knowledge from low-entropy samples in a dynamic cache and fostering efficient adaptation. However, these methods face two critical reliability challenges: (1) entropy often becomes unreliable under distribution shifts, causing error accumulation in the cache and degradation in adaptation performance; (2) the final predictions may be unreliable due to inflexible decision boundaries that fail to accommodate large downstream shifts. To address these challenges, we propose a Reliable Test-time Adaptation (ReTA)…
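As a rough illustration of the cache-based setting the abstract describes (not ReTA itself), the sketch below keeps the lowest-entropy test samples per predicted class in a small dynamic cache. Every name here (`EntropyCache`, `capacity_per_class`, `update`) is hypothetical, and the replacement rule is an assumption chosen for simplicity; the paper's own cache design may differ.

```python
import math
from collections import defaultdict


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class EntropyCache:
    """Hypothetical per-class cache that keeps the lowest-entropy samples."""

    def __init__(self, capacity_per_class=2):
        self.capacity = capacity_per_class
        # Maps predicted class index -> list of (entropy, feature) pairs.
        self.slots = defaultdict(list)

    def update(self, feature, logits):
        """Try to store a test sample under its pseudo-label.

        Returns (predicted_class, entropy, stored_flag).
        """
        probs = softmax(logits)
        pred = max(range(len(probs)), key=probs.__getitem__)
        h = entropy(probs)
        slot = self.slots[pred]
        if len(slot) < self.capacity:
            slot.append((h, feature))
            return pred, h, True
        # Slot is full: replace the highest-entropy entry only if the
        # new sample is more confident (assumed replacement rule).
        worst = max(range(len(slot)), key=lambda i: slot[i][0])
        if h < slot[worst][0]:
            slot[worst] = (h, feature)
            return pred, h, True
        return pred, h, False
```

Note that the cache trusts the pseudo-label whenever entropy is low. A confidently wrong prediction under distribution shift would therefore be stored under the wrong class, which is the error-accumulation risk named in challenge (1).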

Taxonomy

Topics: Multimodal Machine Learning Applications