UMFC: Unsupervised Multi-Domain Feature Calibration for Vision-Language Models

Jiachen Liang; Ruibing Hou; Minyang Hu; Hong Chang; Shiguang Shan; Xilin Chen

arXiv:2411.06921 · cs.CV · November 12, 2024

TL;DR

This paper introduces UMFC, a training-free, label-free method to calibrate features in vision-language models like CLIP, reducing domain bias and improving zero-shot transfer across multiple domains without additional labeled data.

Contribution

The paper proposes a novel unsupervised, training-free feature calibration technique to mitigate domain bias in CLIP, enhancing its transferability without extra annotations or optimization.

Findings

01

UMFC effectively reduces domain bias in CLIP's features.

02

UMFC outperforms baseline CLIP on multiple domain-transfer tasks.

03

UMFC achieves comparable results to state-of-the-art methods requiring labels or training.

Abstract

Pre-trained vision-language models (e.g., CLIP) have shown powerful zero-shot transfer capabilities. However, they still struggle with domain shifts and typically require labeled data to adapt to downstream tasks, which can be costly. In this work, we aim to leverage unlabeled data that naturally spans multiple domains to enhance the transferability of vision-language models. Under this unsupervised multi-domain setting, we identify inherent model bias within CLIP, notably in its visual and text encoders. Specifically, we observe that CLIP's visual encoder tends to prioritize encoding domain information over discriminative category information, while its text encoder exhibits a preference for domain-relevant classes. To mitigate this model bias, we propose a training-free and label-free feature calibration method, Unsupervised Multi-domain Feature Calibration (UMFC). UMFC estimates…
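
The abstract is cut off before it says how UMFC estimates its biases, so the block below does not reproduce the paper's algorithm. It is only a minimal sketch of what a training-free, label-free calibration of image features could look like, under two assumptions that are not taken from the source: unlabeled features are grouped into pseudo-domains with plain k-means, and each group's mean feature is treated as its domain bias and subtracted. The function names (`kmeans`, `calibrate`) and the toy 4-dimensional features are hypothetical.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (left unchanged if it is all zeros).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def mean_vector(vectors):
    # Element-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(features, k, iters=20):
    # Plain k-means with farthest-point initialisation (assumed stand-in
    # for discovering domains without labels). Returns one cluster id per feature.
    centers = [features[0]]
    while len(centers) < k:
        centers.append(max(features, key=lambda f: min(sq_dist(f, c) for c in centers)))
    assign = []
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: sq_dist(f, centers[j])) for f in features]
        for j in range(k):
            members = [f for f, a in zip(features, assign) if a == j]
            if members:
                centers[j] = mean_vector(members)
    return assign

def calibrate(features, assign):
    # Assumed calibration rule: subtract each pseudo-domain's mean feature,
    # then re-normalise so cosine similarity remains meaningful.
    k = max(assign) + 1
    means = [mean_vector([f for f, a in zip(features, assign) if a == j]) for j in range(k)]
    return [l2_normalize([x - m for x, m in zip(f, means[a])])
            for f, a in zip(features, assign)]

def cosine(a, b):
    # Dot product; equals cosine similarity for unit-length inputs.
    return sum(x * y for x, y in zip(a, b))

# Toy features: dims 0-1 carry a strong "domain" signal, dims 2-3 a weak "class" signal.
raw = [[3, 0, 1, 0],   # domain A, class 0
       [3, 0, 0, 1],   # domain A, class 1
       [0, 3, 1, 0],   # domain B, class 0
       [0, 3, 0, 1]]   # domain B, class 1
feats = [l2_normalize(f) for f in raw]
pseudo_domains = kmeans(feats, k=2)
calibrated = calibrate(feats, pseudo_domains)

print("same class, different domain, before:", cosine(feats[0], feats[2]))
print("same class, different domain, after: ", cosine(calibrated[0], calibrated[2]))
```

In this toy setup the raw features of two same-class images from different domains are less similar than two different-class images from the same domain, which mirrors the visual-encoder bias the abstract describes. After the per-cluster mean is removed, the same-class pair aligns. Real CLIP features would come from the model's image encoder rather than hand-written vectors.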

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

git-ljc/umfc
PyTorch · Official

Videos

UMFC: Unsupervised Multi-Domain Feature Calibration for Vision-Language Models · SlidesLive

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques

Methods: Contrastive Language-Image Pre-training