Gradient constrained sharpness-aware prompt learning for vision-language models

Liangchen Liu; Nannan Wang; Dawei Zhou; Xinbo Gao; Decheng Liu; Xi Yang; Tongliang Liu

arXiv:2309.07866·cs.CV·September 21, 2023

TL;DR

This paper introduces GCSCoOp, a novel prompt learning method for vision-language models that balances performance on seen and unseen classes by dynamically constraining the gradient during optimization.

Contribution

It proposes a gradient-constrained SAM-based approach to improve the trade-off between seen and unseen class performance in prompt learning.

Findings

01. GCSCoOp outperforms existing methods in balancing seen-class and unseen-class accuracy.

02. The method constrains the optimizing gradient so that both loss value and loss sharpness are optimized.

03. Experimental results verify the effectiveness of GCSCoOp on the trade-off problem.

Abstract

This paper targets a novel trade-off problem in generalizable prompt learning for vision-language models (VLM), i.e., improving the performance on unseen classes while maintaining the performance on seen classes. Compared with existing generalizable methods that neglect the degradation on seen classes, the setting of this problem is stricter and fits more closely with practical applications. To solve this problem, we start from the optimization perspective and leverage the relationship between loss landscape geometry and model generalization ability. By analyzing the loss landscapes of the state-of-the-art method and the vanilla Sharpness-aware Minimization (SAM) based method, we conclude that the trade-off performance correlates with both loss value and loss sharpness, and that each of them is indispensable. However, we find the optimizing gradient of existing methods cannot maintain high…
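
The vanilla SAM baseline that the abstract refers to can be sketched as below. This is a minimal illustration of a standard SAM update on a hypothetical two-parameter quadratic loss, not the gradient constraint that GCSCoOp adds. The names `loss`, `grad`, and `sam_step`, the step size `lr`, and the perturbation radius `rho` are all assumptions chosen for the example.

```python
import math


def loss(w):
    # Hypothetical toy loss: steep along w[0], flat along w[1].
    return 0.5 * (10.0 * w[0] ** 2 + 0.1 * w[1] ** 2)


def grad(w):
    # Analytic gradient of the toy loss above.
    return [10.0 * w[0], 0.1 * w[1]]


def sam_step(w, lr=0.05, rho=0.05):
    """One vanilla SAM update (illustrative values for lr and rho)."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # No ascent direction is defined at a stationary point.
        return list(w)
    # Step 1: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Step 2: evaluate the gradient at the perturbed point.
    g_sam = grad(w_adv)
    # Step 3: apply that gradient to the original weights.
    return [wi - lr * gi for wi, gi in zip(w, g_sam)]


w_init = [1.0, 1.0]
w = list(w_init)
for _ in range(100):
    w = sam_step(w)

print("initial loss:", loss(w_init))
print("final loss:", loss(w))
```

Because the descent gradient is taken at the perturbed point, the update reacts to how quickly the loss rises around the current weights as well as to the loss value itself, which is the sense in which SAM targets sharpness.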

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Sharpness-Aware Minimization