Robust Fine-tuning of Zero-shot Models via Variance Reduction

Beier Zhu; Jiequan Cui; Hanwang Zhang

arXiv:2411.06966·cs.CV·November 12, 2024

TL;DR

This paper introduces VRF, a sample-wise ensembling method for fine-tuned zero-shot models that reduces prediction variance, improving accuracy on both in-distribution (ID) and out-of-distribution (OOD) data without trading one for the other.

Contribution

The paper proposes a sample-wise ensembling technique called Variance Reduction Fine-tuning (VRF) that enhances robustness of zero-shot models by reducing prediction variance during fine-tuning.
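
To make "sample-wise ensembling" concrete, here is a minimal sketch that contrasts one shared mixing coefficient with a separate weight per sample. It assumes the ensemble mixes class probabilities from the zero-shot and fine-tuned models; the function names, the toy probabilities, and the per-sample weights are hypothetical and are not taken from the paper or its repository.

```python
def ensemble(p_zs, p_ft, w):
    """Mix two probability vectors; w is the weight on the fine-tuned model."""
    return [(1.0 - w) * a + w * b for a, b in zip(p_zs, p_ft)]


def argmax(p):
    """Index of the largest entry in p."""
    return max(range(len(p)), key=p.__getitem__)


# Toy class probabilities for two samples over three classes (made-up numbers).
p_zs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]  # zero-shot model
p_ft = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]  # fine-tuned model

# Global ensembling: a single mixing coefficient shared by every sample.
alpha = 0.5
global_preds = [argmax(ensemble(z, f, alpha)) for z, f in zip(p_zs, p_ft)]

# Sample-wise ensembling: each sample gets its own weight.
# These weights are hypothetical placeholders, not the paper's values.
weights = [0.2, 0.9]
samplewise_preds = [
    argmax(ensemble(z, f, w)) for z, f, w in zip(p_zs, p_ft, weights)
]
```

With a per-sample weight, the first sample can lean on the zero-shot model while the second leans on the fine-tuned one, which a single shared coefficient cannot express.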

Findings

1. VRF improves OOD accuracy by 1.5-2.0 percentage points over ensemble baselines.

2. VRF maintains or increases ID accuracy while boosting robustness.

3. VRF achieves significant gains across multiple distribution shift benchmarks.

Abstract

When fine-tuning zero-shot models like CLIP, our desideratum is for the fine-tuned model to excel on both in-distribution (ID) and out-of-distribution (OOD) data. Recently, ensemble-based models (ESMs) have been shown to offer significant robustness improvements while preserving high ID accuracy. However, our study finds that ESMs do not solve the ID-OOD trade-off: they achieve peak ID and OOD accuracy at different mixing coefficients. When optimized for OOD accuracy, the ensemble model exhibits a noticeable decline in ID accuracy, and vice versa. In contrast, we propose a sample-wise ensembling technique that can simultaneously attain the best ID and OOD accuracy without the trade-off. Specifically, we construct a Zero-Shot Failure (ZSF) set containing training samples incorrectly predicted by the zero-shot model. For each test sample, we calculate its distance to the ZSF…
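
The two steps named in the abstract, building the ZSF set and measuring a test sample's distance to it, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: samples are represented by feature vectors, the metric is Euclidean, and the distance is averaged over the `k` nearest ZSF members. All function and variable names are hypothetical. The sketch stops at the distance computation, which is as far as the text above goes.

```python
import math


def build_zsf_set(features, labels, zero_shot_preds):
    """Keep the features of training samples the zero-shot model got wrong."""
    return [f for f, y, p in zip(features, labels, zero_shot_preds) if p != y]


def distance_to_zsf(x, zsf_set, k=1):
    """Mean Euclidean distance from x to its k nearest ZSF members.

    The k-nearest-neighbour form and the Euclidean metric are assumptions.
    """
    if not zsf_set:
        return math.inf  # no zero-shot failures recorded
    dists = sorted(math.dist(x, z) for z in zsf_set)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


# Toy training data: 2-D features, true labels, zero-shot predictions.
train_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
train_labels = [0, 1, 0, 1]
zero_shot_preds = [0, 0, 1, 1]  # the second and third samples are wrong

zsf = build_zsf_set(train_features, train_labels, zero_shot_preds)
d = distance_to_zsf([0.0, 0.0], zsf, k=1)
```

Here `zsf` holds the two misclassified training points, and `d` is the distance from the query to the closer of them.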

Code & Models

Repositories

beierzhu/vrf (PyTorch, official)

Taxonomy

Topics: Model Reduction and Neural Networks · Medical Imaging Techniques and Applications · Nuclear Physics and Applications

Methods: Sparse Evolutionary Training · Contrastive Language-Image Pre-training