Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption

Kazuki Adachi; Shin'ya Yamaguchi; Tomoki Hamagami

arXiv:2505.12912·cs.CV·May 20, 2025

TL;DR

This paper introduces UnInfo, a test-time adaptation method for vision-language models such as CLIP. UnInfo improves robustness to sensor degradation by maintaining embedding uniformity and information balance during adaptation.

Contribution

The paper proposes a novel uniformity-aware test-time adaptation method called UnInfo that specifically addresses sensor degradation in vision-language models, a challenge not tackled by existing methods.

Findings

01

UnInfo improves accuracy on sensor-degraded images.

02

Maintains embedding uniformity and information balance during adaptation.

03

Outperforms existing TTA methods under sensor degradation conditions.
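
To make "embedding uniformity" concrete, the sketch below computes one widely used uniformity measure: the log of the mean Gaussian potential over all pairs of L2-normalized embeddings, where lower values mean the embeddings are spread more evenly over the unit sphere. This is an illustration only. The function names, the temperature t=2.0, and the toy vectors are assumptions, and the measure UnInfo actually optimizes may be defined differently.

```python
import math
from itertools import combinations


def l2_normalize(vector):
    """Scale a vector to unit length so it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower (more negative) values indicate embeddings spread more evenly
    over the unit sphere; 0.0 means every embedding is identical.
    The temperature t=2.0 is an assumed default, not taken from the paper.
    """
    unit = [l2_normalize(v) for v in embeddings]
    potentials = [
        math.exp(-t * sum((a - b) ** 2 for a, b in zip(u, v)))
        for u, v in combinations(unit, 2)
    ]
    return math.log(sum(potentials) / len(potentials))


# Toy 2-D "image embeddings" (hypothetical values for illustration).
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
collapsed = [[1.0, 0.0], [0.99, 0.1], [0.98, 0.15], [1.0, 0.05]]

print("spread:   ", uniformity(spread))
print("collapsed:", uniformity(collapsed))
```

In this toy setup the collapsed set scores closer to zero than the spread set, which is the kind of loss of uniformity the summary associates with corrupted inputs.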

Abstract

Pre-trained vision-language models such as contrastive language-image pre-training (CLIP) have demonstrated remarkable generalizability, which has enabled a wide range of applications such as zero-shot classification. However, vision-language models still suffer when they face datasets that differ substantially from the training data, i.e., distribution shifts. We found that CLIP is especially vulnerable to sensor degradation, a type of realistic distribution shift caused by sensor conditions such as weather, light, or noise. Collecting a new dataset from a test distribution for fine-tuning is costly because sensor degradation occurs unexpectedly and takes many forms. Thus, we investigate test-time adaptation (TTA) of zero-shot classification, which enables on-the-fly adaptation to the test distribution with unlabeled test data. Existing TTA methods for CLIP mainly focus on modifying…

Code & Models

Repositories

kzkadc/uninfo (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning

Methods: Focus · Contrastive Language-Image Pre-training · Knowledge Distillation