In Search of the Successful Interpolation: On the Role of Sharpness in CLIP Generalization

Alireza Abdollahpoorrostam

arXiv:2410.16476 · cs.LG · October 23, 2024

TL;DR

This paper investigates how layer-wise sharpness, especially in straggler layers, influences the generalization of CLIP models during Robust Fine-Tuning, challenging traditional beliefs about flat minima and proposing new insights for model robustness.

Contribution

It introduces the concept of layer-wise sharpness in straggler layers as a reliable indicator of CLIP's OOD generalization during interpolation, and explores sparsity to improve robustness.

Findings

01

Layer-wise sharpness correlates with OOD generalization in CLIP.

02

Sharpness of the model as a whole is not a reliable predictor of OOD generalization for modern architectures like CLIP.

03

Sparsity in straggler layers mitigates failure modes in RFT.
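
As a rough illustration of what a layer-wise sharpness measurement can look like, the sketch below perturbs the weights of a single layer on a small sphere and records the worst loss increase it finds. This is one common proxy and only an assumption here: the paper's exact definition is not given on this page, and `layer_sharpness`, `toy_loss`, `rho`, and the layer names are all hypothetical.

```python
import random
from typing import Callable, Dict, List

Weights = Dict[str, List[float]]


def layer_sharpness(
    loss_fn: Callable[[Weights], float],
    weights: Weights,
    layer: str,
    rho: float = 0.1,
    n_samples: int = 64,
    seed: int = 0,
) -> float:
    """Worst sampled loss increase when only `layer` is perturbed by norm rho.

    Assumption: a random-direction proxy for sharpness, not the paper's metric.
    """
    rng = random.Random(seed)
    base = loss_fn(weights)
    worst = 0.0
    for _ in range(n_samples):
        # Draw a random direction and rescale it to have length rho.
        direction = [rng.gauss(0.0, 1.0) for _ in weights[layer]]
        norm = sum(d * d for d in direction) ** 0.5 or 1.0
        perturbed = dict(weights)  # shallow copy; other layers stay untouched
        perturbed[layer] = [
            w + rho * d / norm for w, d in zip(weights[layer], direction)
        ]
        worst = max(worst, loss_fn(perturbed) - base)
    return worst


def toy_loss(w: Weights) -> float:
    # Hypothetical quadratic loss: steep along "sharp", shallow along "flat".
    return 10.0 * sum(x * x for x in w["sharp"]) + 0.1 * sum(x * x for x in w["flat"])


weights = {"sharp": [0.0, 0.0], "flat": [0.0, 0.0]}
scores = {name: layer_sharpness(toy_loss, weights, name) for name in weights}
```

On this toy loss the `sharp` layer scores about 100 times higher than the `flat` layer, which is the kind of per-layer contrast a layer-wise measurement is meant to expose.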

Abstract

Zero-shot models like CLIP are often fine-tuned on a target dataset to further improve their accuracy, but this can compromise out-of-distribution (OOD) robustness. Robust Fine-Tuning (RFT) (Wortsman et al., 2021), which interpolates between the zero-shot and fine-tuned models, has been proposed to address this issue. However, understanding of when RFT actually improves OOD error remains limited. In this work, we empirically investigate the robustness of RFT in CLIP models, with a focus on the sharpness of the CLIP model during interpolation. First, we demonstrate that sharpness may not serve as a reliable indicator for predicting the generalization of modern architectures like CLIP on OOD data, which challenges the conventional belief in the generalization benefits of flat minima in foundation models. However, by…
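
The interpolation step that RFT relies on can be sketched as follows, assuming it is linear in weight space, i.e. theta(alpha) = (1 - alpha) * theta_zs + alpha * theta_ft. The function name `interpolate_weights`, the dictionary layout, and the toy values are hypothetical and are not taken from the paper's repository.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def interpolate_weights(zero_shot: Weights, fine_tuned: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * zero_shot + alpha * fine_tuned, layer by layer."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("both models must share the same layer names")
    return {
        name: [
            (1.0 - alpha) * z + alpha * f
            for z, f in zip(zero_shot[name], fine_tuned[name])
        ]
        for name in zero_shot
    }


# Toy two-layer "models" with made-up values (assumption, for illustration only).
theta_zs = {"layer0": [0.0, 2.0], "layer1": [1.0]}
theta_ft = {"layer0": [1.0, 4.0], "layer1": [3.0]}
theta_mid = interpolate_weights(theta_zs, theta_ft, alpha=0.5)
```

With `alpha=0.0` the result equals the zero-shot weights and with `alpha=1.0` the fine-tuned weights; sweeping `alpha` between the two traces the interpolation path along which OOD error and sharpness would be evaluated.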

Code & Models

Repositories

alirezaabdollahpour/clip_mode_connectivity
Official

Taxonomy

Topics: Linguistic Studies and Language Acquisition · Mathematics, Computing, and Information Processing · Natural Language Processing Techniques

Methods: Contrastive Language-Image Pre-training · Focus