Towards Calibrated Robust Fine-Tuning of Vision-Language Models
Changdae Oh, Hyesu Lim, Mijoo Kim, Dongyoon Han, Sangdoo Yun, Jaegul Choo, Alexander Hauptmann, Zhi-Qi Cheng, Kyungwoo Song

TL;DR
This paper introduces a robust fine-tuning method for vision-language models that improves both out-of-distribution (OOD) accuracy and confidence calibration, combining a theoretical upper bound on OOD error with a constrained multimodal contrastive loss.
Contribution
It presents a framework that improves OOD accuracy and calibration by enforcing a larger smallest singular value of the in-distribution (ID) input covariance matrix during fine-tuning, guided by self-distillation.
Findings
Improved OOD accuracy on ImageNet benchmarks.
Enhanced confidence calibration in vision-language models.
Theoretical bounds linking calibration errors and data covariance.
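To make "confidence calibration" concrete, here is a minimal sketch of the standard binned expected calibration error (ECE), assuming NumPy. The function name, the 10 equal-width bins, and the toy inputs are illustrative assumptions; the paper's own evaluation protocol is not described in this summary.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over confidence bins.

    confidences: predicted probability of the chosen class, one per sample.
    correct:     1 if the prediction was right, 0 otherwise.
    n_bins:      number of equal-width bins (an assumed default, not from the paper).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)

    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins (lo, hi]; a confidence of exactly 0 falls in no bin,
        # which does not occur for max-softmax confidences.
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece


# Toy example: a model that is 90% confident but only 75% accurate.
toy_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

In the toy example all four samples land in one bin, so the ECE is the gap between 0.9 confidence and 0.75 accuracy. A lower value means confidence tracks accuracy more closely.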
Abstract
Improving out-of-distribution (OOD) generalization during in-distribution (ID) adaptation is a primary goal of robust fine-tuning of zero-shot models beyond naive fine-tuning. However, despite decent OOD generalization performance from recent robust fine-tuning methods, confidence calibration for reliable model output has not been fully addressed. This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision-language models. First, we show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms computed on ID data: 1) the ID calibration error and 2) the smallest singular value of the ID input covariance matrix. Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value,…
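The second term of the bound, the smallest singular value of the ID input covariance matrix, can be measured on a batch of inputs. The following is a minimal sketch assuming NumPy; the function name, the mean-centering step, and the toy matrices are illustrative assumptions rather than the authors' implementation, and the constraint the paper uses to enlarge this value is not shown because the abstract is truncated here.

```python
import numpy as np


def smallest_singular_value_of_covariance(features):
    """Smallest singular value of the sample covariance of `features`.

    features: array of shape (n_samples, n_dims). Treating rows as ID inputs
    (or their embeddings) is an assumption made for this sketch.
    """
    x = np.asarray(features, dtype=float)
    x = x - x.mean(axis=0, keepdims=True)   # center each dimension (assumed)
    cov = x.T @ x / (x.shape[0] - 1)        # unbiased sample covariance
    # Singular values of a symmetric PSD matrix equal its eigenvalues.
    return np.linalg.svd(cov, compute_uv=False).min()


# Inputs that spread along both axes give a clearly positive value.
spread = smallest_singular_value_of_covariance(
    [[1, 0], [-1, 0], [0, 2], [0, -2]]
)

# Inputs collapsed onto a single line give a value near zero.
collapsed = smallest_singular_value_of_covariance(
    [[1, 1], [2, 2], [3, 3]]
)
```

A value near zero indicates that the inputs occupy a lower-dimensional subspace. Per the abstract, the proposed fine-tuning loss pushes this value upward.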
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
