Gradient constrained sharpness-aware prompt learning for vision-language models
Liangchen Liu, Nannan Wang, Dawei Zhou, Xinbo Gao, Decheng Liu, Xi Yang, Tongliang Liu

TL;DR
This paper introduces GCSCoOp, a novel prompt learning method for vision-language models that balances performance on seen and unseen classes by dynamically constraining the gradient during optimization.
Contribution
It proposes a gradient-constrained SAM-based approach to improve the trade-off between seen and unseen class performance in prompt learning.
Findings
GCSCoOp outperforms existing methods in balancing seen and unseen class accuracy.
The method constrains the optimizing gradient so that both loss value and loss sharpness are optimized.
Experimental results verify the superiority of GCSCoOp in practical scenarios.
Abstract
This paper targets a novel trade-off problem in generalizable prompt learning for vision-language models (VLMs), i.e., improving the performance on unseen classes while maintaining the performance on seen classes. Compared with existing generalizable methods that neglect the degradation on seen classes, the setting of this problem is stricter and fits more closely with practical applications. To solve this problem, we start from the optimization perspective and leverage the relationship between loss landscape geometry and model generalization ability. By analyzing the loss landscapes of the state-of-the-art method and the vanilla Sharpness-aware Minimization (SAM) based method, we conclude that the trade-off performance correlates with both loss value and loss sharpness, and that each of them is indispensable. However, we find the optimizing gradient of existing methods cannot maintain high…
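For readers unfamiliar with the vanilla SAM baseline named in the abstract, the following is a minimal sketch of one SAM update on a toy quadratic loss. It illustrates only generic SAM (perturb the weights along the normalized gradient, then descend using the gradient taken at the perturbed point); it is not the paper's gradient-constrained variant, whose details are not given in the text above. The toy loss and the names `loss`, `grad`, `sam_step`, `curvature`, `lr`, and `rho` are all assumptions made for illustration.

```python
import math


def loss(w, curvature):
    """Toy loss L(w) = 0.5 * sum(c_i * w_i**2); a stand-in, not the paper's objective."""
    return 0.5 * sum(c * x * x for c, x in zip(curvature, w))


def grad(w, curvature):
    """Analytic gradient of the toy loss above."""
    return [c * x for c, x in zip(curvature, w)]


def sam_step(w, curvature, lr=0.05, rho=0.05):
    """One vanilla SAM update (hypothetical helper for illustration)."""
    g = grad(w, curvature)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        # Zero gradient: no ascent direction is defined, so leave w unchanged.
        return list(w)
    # Ascent step: move to the approximate worst-case point within radius rho.
    w_adv = [x + rho * gx / norm for x, gx in zip(w, g)]
    # Descent step: apply the gradient evaluated at the perturbed point to w.
    g_sam = grad(w_adv, curvature)
    return [x - lr * gx for x, gx in zip(w, g_sam)]


# Usage: run a few SAM steps on a loss that is sharp in one direction
# (curvature 10) and flat in the other (curvature 1).
curvature = [10.0, 1.0]
w = [1.0, 1.0]
for _ in range(50):
    w = sam_step(w, curvature)
```

In this sketch `rho` sets the radius of the neighborhood whose worst-case loss is targeted, which is how SAM couples the loss value with a first-order proxy for sharpness.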
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Sharpness-Aware Minimization
