Efficient and Versatile Robust Fine-Tuning of Zero-shot Models

Sungyeon Kim; Boseung Jeong; Donghyun Kim; Suha Kwak

arXiv:2408.05749 · cs.CV · August 13, 2024

TL;DR

This paper introduces R-Adapter, a lightweight fine-tuning method for zero-shot models that enhances out-of-distribution robustness and versatility across multiple vision-language tasks with minimal parameter updates.

Contribution

The paper presents R-Adapter, a novel lightweight fine-tuning approach with self-ensemble and MPM-NCE loss, extending robust fine-tuning to diverse tasks beyond classification.
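One reason a lightweight adapter can be cheap to store and run is that a purely linear residual adapter folds into the frozen weight it follows. The sketch below illustrates that general idea, along with a simple weight-space ensemble obtained by rescaling the adapter. It is a toy under stated assumptions, not the paper's implementation: the linear adapter form, the `scale` parameter, and every function name here are hypothetical and do not come from this summary.

```python
# Illustrative only: a linear residual adapter folded into a frozen weight.
# The adapter form and all names below are assumptions, not the paper's code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def merge_adapter(w_frozen, adapter, scale=1.0):
    """Return (I + scale * adapter) @ w_frozen.

    scale=0 recovers the pre-trained weight and scale=1 the fully adapted one;
    values in between interpolate the two in weight space.
    """
    n = len(adapter)
    eye = identity(n)
    residual = [[eye[i][j] + scale * adapter[i][j] for j in range(n)] for i in range(n)]
    return matmul(residual, w_frozen)


def apply(weight, x):
    """Compute weight @ x for a vector x given as a flat list."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


w0 = [[1.0, 2.0], [0.0, 1.0]]    # frozen pre-trained layer (toy 2x2)
a = [[0.1, 0.0], [0.0, -0.2]]    # small trainable adapter (toy 2x2)
x = [1.0, 1.0]

# Two-step forward pass: frozen layer, then the residual adapter.
h = apply(w0, x)
two_step = [hi + ai for hi, ai in zip(h, apply(a, h))]

# One-step forward pass with the adapter merged into the weight.
merged = apply(merge_adapter(w0, a), x)
```

In this toy setting `two_step` and `merged` agree up to floating-point error, so only the merged weight would need to be kept after training.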

Findings

01. Achieves state-of-the-art performance on various tasks.

02. Tunes only 13% of model parameters.

03. Significantly improves out-of-distribution robustness.

Abstract

Large-scale image-text pre-trained models enable zero-shot classification and provide consistent accuracy across various data distributions. Nonetheless, optimizing these models in downstream tasks typically requires fine-tuning, which reduces generalization to out-of-distribution (OOD) data and demands extensive computational resources. We introduce Robust Adapter (R-Adapter), a novel method for fine-tuning zero-shot models to downstream tasks while simultaneously addressing both these issues. Our method integrates lightweight modules into the pre-trained model and employs novel self-ensemble techniques to boost OOD robustness and reduce storage expenses substantially. Furthermore, we propose MPM-NCE loss designed for fine-tuning on vision-language downstream tasks. It ensures precise alignment of multiple image-text pairs and discriminative feature learning. By extending the benchmark…
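To make the idea of aligning multiple image-text pairs concrete, here is a minimal sketch of a generic multi-positive InfoNCE-style loss in the image-to-text direction. It is not the MPM-NCE loss itself, whose exact formulation is not given in this abstract; the function name `multi_positive_nce`, the `temperature` default, and the averaging over positives are all assumptions made for illustration.

```python
import math


def multi_positive_nce(sim, positives, temperature=0.07):
    """Generic multi-positive contrastive loss, averaged over images.

    sim[i][j]    : similarity between image i and text j.
    positives[i] : list of text indices that match image i (at least one).
    NOTE: an illustrative stand-in, not the paper's MPM-NCE.
    """
    total = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        # Numerically stable log-sum-exp over all candidate texts.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Average the negative log-probability over every positive text.
        log_probs = [logits[j] - log_denom for j in positives[i]]
        total += -sum(log_probs) / len(log_probs)
    return total / len(sim)


# Toy batch: image 0 matches texts 0 and 1, image 1 matches text 2.
positives = [[0, 1], [2]]
aligned = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
misaligned = [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

loss_aligned = multi_positive_nce(aligned, positives)
loss_misaligned = multi_positive_nce(misaligned, positives)
```

With these toy similarities, `loss_aligned` comes out lower than `loss_misaligned`, which is the behavior a loss of this family is meant to reward.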

Taxonomy

Topics: Model Reduction and Neural Networks · Nuclear reactor physics and engineering · Advanced Image Processing Techniques

Methods: Sparse Evolutionary Training · Contrastive Language-Image Pre-training · Adapter