Efficient and Versatile Robust Fine-Tuning of Zero-shot Models
Sungyeon Kim, Boseung Jeong, Donghyun Kim, Suha Kwak

TL;DR
This paper introduces R-Adapter, a lightweight fine-tuning method for zero-shot models that enhances out-of-distribution robustness and versatility across multiple vision-language tasks with minimal parameter updates.
Contribution
The paper presents R-Adapter, a novel lightweight fine-tuning approach with self-ensemble and MPM-NCE loss, extending robust fine-tuning to diverse tasks beyond classification.
Findings
Achieves state-of-the-art performance across diverse vision-language tasks, not only classification.
Tunes only 13% of model parameters.
Significantly improves out-of-distribution robustness.
Abstract
Large-scale image-text pre-trained models enable zero-shot classification and provide consistent accuracy across various data distributions. Nonetheless, optimizing these models in downstream tasks typically requires fine-tuning, which reduces generalization to out-of-distribution (OOD) data and demands extensive computational resources. We introduce Robust Adapter (R-Adapter), a novel method for fine-tuning zero-shot models to downstream tasks while simultaneously addressing both these issues. Our method integrates lightweight modules into the pre-trained model and employs novel self-ensemble techniques to boost OOD robustness and reduce storage expenses substantially. Furthermore, we propose MPM-NCE loss designed for fine-tuning on vision-language downstream tasks. It ensures precise alignment of multiple image-text pairs and discriminative feature learning. By extending the benchmark…
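The abstract says the MPM-NCE loss aligns multiple image-text pairs, but the truncated text here does not give its formula. As a rough illustration of the general idea only, the sketch below implements a generic multi-positive InfoNCE-style loss in plain Python. It is not the paper's MPM-NCE. The function name `multi_positive_nce`, the `temperature` default, and the assumption that each image may match several texts are all hypothetical choices made for this example.

```python
import math


def multi_positive_nce(sim, positives, temperature=0.07):
    """Generic multi-positive InfoNCE-style loss (illustrative only, NOT MPM-NCE).

    sim:        list of rows; sim[i][j] is the similarity of image i and text j.
    positives:  list of sets; positives[i] holds the text indices matching image i.
    temperature: softmax temperature (assumed value, not taken from the paper).
    """
    total = 0.0
    for row, pos in zip(sim, positives):
        logits = [s / temperature for s in row]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Average the negative log-probability over every positive text of this image.
        total += -sum(logits[j] - log_denom for j in pos) / len(pos)
    return total / len(sim)


# Toy batch: image 0 matches texts 0 and 1, image 1 matches text 2.
sim = [[0.9, 0.8, 0.1],
       [0.0, 0.2, 0.7]]
positives = [{0, 1}, {2}]
loss = multi_positive_nce(sim, positives)
```

Averaging over each image's positives is one simple way to let several captions count as correct matches for the same image. Any margin or other terms in the actual MPM-NCE loss are not modeled here.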
Taxonomy
Topics: Model Reduction and Neural Networks · Nuclear reactor physics and engineering · Advanced Image Processing Techniques
Methods: Sparse Evolutionary Training · Contrastive Language-Image Pre-training · Adapter
