Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation
Kun Ding, Qiang Yu, Haojian Zhang, Gaofeng Meng, Shiming Xiang

TL;DR
This paper introduces a calibrated cache model for few-shot vision-language adaptation, incorporating similarity, weight, and confidence calibrations to improve accuracy and reliability over existing methods.
Contribution
The work proposes novel calibration modules and variants that enhance cache-based VLM adaptation by addressing image-image similarity, relations among training samples, and prediction confidence, achieving state-of-the-art results.
Findings
Achieves state-of-the-art performance on 11 few-shot classification datasets.
Effectively models training sample relations with Gaussian Process regression.
Improves confidence estimation to enhance prediction reliability.
Abstract
Cache-based approaches stand out as both effective and efficient for adapting vision-language models (VLMs). Nonetheless, the existing cache model overlooks three crucial aspects. 1) Pre-trained VLMs are mainly optimized for image-text similarity and neglect image-image similarity, leading to a gap between pre-training and adaptation. 2) The current cache model is based on the Nadaraya-Watson (N-W) estimator, which disregards the intricate relationships among training samples when constructing the weight function. 3) With limited samples, the logits generated by the cache model are highly uncertain, so using them directly without accounting for confidence could be problematic. This work presents three calibration modules aimed at addressing the above challenges. Similarity Calibration refines the image-image similarity by using unlabeled images.…
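The contrast the abstract draws between the N-W estimator and a model that accounts for relations among training samples can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The kernel `exp(-beta * (1 - cosine similarity))`, the parameters `beta` and `noise`, and all function names are assumptions made for illustration. The N-W version weights each cached label only by its similarity to the query, while the Gaussian Process posterior mean also passes the labels through the inverse of the key-key kernel matrix, which is where inter-sample relations enter.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale feature vectors to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def kernel(a, b, beta=5.0):
    """Assumed similarity kernel: exp(-beta * (1 - cosine similarity))."""
    return np.exp(-beta * (1.0 - a @ b.T))

def nw_logits(query, keys, labels_onehot, beta=5.0):
    """Nadaraya-Watson estimate: similarity-weighted average of cached labels.

    Each weight depends only on one (query, key) pair, so relations
    among the cached keys themselves are ignored.
    """
    w = kernel(query, keys, beta)              # (Q, N)
    w = w / w.sum(axis=1, keepdims=True)       # normalize weights per query
    return w @ labels_onehot                   # (Q, C)

def gp_logits(query, keys, labels_onehot, beta=5.0, noise=0.1):
    """GP regression posterior mean with the same kernel.

    The key-key matrix couples the training samples through a linear solve.
    """
    k_qk = kernel(query, keys, beta)           # (Q, N)
    k_kk = kernel(keys, keys, beta)            # (N, N)
    alpha = np.linalg.solve(k_kk + noise * np.eye(len(keys)), labels_onehot)
    return k_qk @ alpha                        # (Q, C)

# Toy usage with random stand-ins for image features (hypothetical data).
rng = np.random.default_rng(0)
keys = l2_normalize(rng.normal(size=(6, 8)))    # 6 cached training features
labels = np.eye(3)[[0, 0, 1, 1, 2, 2]]          # 3 classes, 2 shots each
query = l2_normalize(rng.normal(size=(4, 8)))   # 4 test features
print(nw_logits(query, keys, labels).shape)     # (4, 3)
print(gp_logits(query, keys, labels).shape)     # (4, 3)
```

In this sketch the two estimators share the same query-key similarities, so any difference in their outputs comes from the key-key term in the GP solve.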
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · COVID-19 diagnosis using AI
Methods: Residual Connection · Gaussian Process · Contrastive Language-Image Pre-training
