Training-Free Test-Time Adaptation with Brownian Distance Covariance in Vision-Language Models

Yi Zhang; Chun-Wun Cheng; Angelica I. Aviles-Rivero; Zhihai He; Liang-Jie Zhang

arXiv:2601.23253·cs.CV·February 2, 2026

TL;DR

This paper introduces TaTa, a training-free, efficient test-time adaptation method for vision-language models that uses Brownian Distance Covariance to improve domain generalization without retraining.

Contribution

It proposes a novel, training-free adaptation approach leveraging Brownian Distance Covariance, enhancing efficiency and stability in vision-language models under domain shift.

Findings

01. Significantly reduces computational cost compared to existing methods.

02. Achieves state-of-the-art performance in domain and cross-dataset generalization.

03. Improves model stability by avoiding weight updates.

Abstract

Vision-language models (VLMs) suffer performance degradation under domain shift, limiting real-world applicability. Existing test-time adaptation methods are computationally intensive, rely on back-propagation, and often focus on single modalities. To address these issues, we propose Training-free Test-Time Adaptation with Brownian Distance Covariance (TaTa). TaTa leverages Brownian Distance Covariance, a powerful statistical measure that captures both linear and nonlinear dependencies via pairwise distances, to dynamically adapt VLMs to new domains without training or back-propagation. This not only improves efficiency but also enhances stability by avoiding disruptive weight updates. TaTa further integrates attribute-enhanced prompting to improve vision-language inference with descriptive visual cues. Combined with dynamic clustering and pseudo-label refinement, it effectively recalibrates the…
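To make the phrase "captures both linear and nonlinear dependencies via pairwise distances" concrete, the following is a minimal sketch of the classical sample distance covariance statistic on paired samples. It is not TaTa's adaptation pipeline, which this page does not describe in detail; the function names (`distance_covariance_sq`, `distance_correlation`) and the toy data are assumptions introduced here for illustration only.

```python
import numpy as np

def _double_centered_distances(x):
    """Pairwise Euclidean distance matrix of the rows of x, double-centered."""
    diff = x[:, None, :] - x[None, :, :]          # shape (n, n, d)
    d = np.sqrt((diff ** 2).sum(axis=-1))         # shape (n, n)
    # Subtract row means and column means, then add back the grand mean.
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())

def distance_covariance_sq(x, y):
    """Squared sample distance covariance of paired samples x and y.

    x has shape (n, p) and y has shape (n, q); row i of x pairs with row i of y.
    """
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    return float((a * b).mean())

def distance_correlation(x, y):
    """Normalized version in [0, 1]; returns 0.0 if either sample is constant."""
    dcov2 = distance_covariance_sq(x, y)
    denom = np.sqrt(distance_covariance_sq(x, x) * distance_covariance_sq(y, y))
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))

# Hypothetical toy data: y depends on x only nonlinearly.
x = np.linspace(-1.0, 1.0, 201).reshape(-1, 1)
y = x ** 2
pearson = np.corrcoef(x[:, 0], y[:, 0])[0, 1]     # linear correlation
dcor = distance_correlation(x, y)                 # distance correlation
```

In this toy case the Pearson correlation is essentially zero because the relationship is symmetric about the origin, while the distance correlation is clearly positive. That is the property the abstract appeals to: a distance-based statistic can register a dependence that a purely linear measure misses.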

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Face recognition and analysis