MSGCoOp: Multiple Semantic-Guided Context Optimization for Few-Shot Learning

Zhaolong Wang; Tongfeng Sun; Mingzheng Du; Yachao Huang

arXiv:2507.21786 · cs.CV · July 30, 2025

TL;DR

MSGCoOp is a semantic-guided ensemble prompt-optimization framework for vision-language models that improves few-shot and cross-domain generalization while remaining computationally efficient.

Contribution

The paper proposes a novel ensemble of semantic-guided prompts with diversity regularization, enhancing generalization in vision-language models without heavy computational costs.

Findings

01 Improves the harmonic mean on base-to-novel generalization by 1.10%.
02 Enhances robustness in cross-domain tasks.
03 Outperforms baseline methods on 11 benchmark datasets.
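
For readers unfamiliar with the metric, the harmonic mean in base-to-novel evaluation is conventionally computed from the base-class and novel-class accuracies as H = 2·B·N / (B + N). The sketch below illustrates that formula only; the function name and the accuracy values are made up for illustration and are not results from the paper.

```python
def harmonic_mean(base_acc: float, novel_acc: float) -> float:
    """Harmonic mean of base-class and novel-class accuracy (both in percent)."""
    if base_acc + novel_acc == 0:
        return 0.0  # avoid division by zero when both accuracies are zero
    return 2.0 * base_acc * novel_acc / (base_acc + novel_acc)


# Illustrative values only (not taken from the paper).
h = harmonic_mean(80.0, 70.0)
print(f"H = {h:.2f}")
```

Because the harmonic mean is pulled toward the smaller of the two accuracies, a method cannot score well on it by overfitting to base classes at the expense of novel ones.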

Abstract

Vision-language pre-trained models (VLMs) such as CLIP have demonstrated remarkable zero-shot generalization, and prompt learning has emerged as an efficient alternative to full fine-tuning. However, existing methods often struggle with generalization to novel classes, a phenomenon attributed to overfitting on seen classes and forgetting general knowledge. Furthermore, recent approaches that improve generalization often introduce complex architectures or heavy computational overhead. In this paper, we propose a Multiple Semantic-Guided Context Optimization (MSGCoOp) framework to enhance few-shot generalization while maintaining computational efficiency. Our approach leverages an ensemble of parallel learnable context vectors to capture diverse semantic aspects. To enrich these prompts, we introduce a semantic guidance mechanism that aligns them with comprehensive class descriptions…
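
The ideas named in the abstract and contribution (an ensemble of parallel prompts, semantic guidance toward class descriptions, and a diversity regularizer) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the excerpt does not give the exact loss forms, so cosine similarity is assumed throughout, the feature vectors stand in for text-encoder outputs, and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ensemble_score(image_feat, prompt_feats):
    """Assumed ensemble rule: average image-text similarity over all prompts of one class."""
    return sum(cosine(image_feat, p) for p in prompt_feats) / len(prompt_feats)


def semantic_guidance_loss(prompt_feats, description_feat):
    """Assumed guidance term: mean (1 - cosine) between each prompt and the class description."""
    return sum(1.0 - cosine(p, description_feat) for p in prompt_feats) / len(prompt_feats)


def diversity_loss(prompt_feats):
    """Assumed diversity term: mean pairwise cosine similarity (needs at least two prompts)."""
    n = len(prompt_feats)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cosine(prompt_feats[i], prompt_feats[j]) for i, j in pairs) / len(pairs)


# Toy 2-D features, purely illustrative.
prompts = [[1.0, 0.0], [0.0, 1.0]]
description = [1.0, 1.0]
image = [1.0, 0.0]
print(ensemble_score(image, prompts))
print(semantic_guidance_loss(prompts, description))
print(diversity_loss(prompts))
```

In this toy setup the two terms pull in different directions: the guidance term draws every prompt toward the same description feature, while the diversity term penalizes prompts that become too similar to one another, which is one plausible reading of why the two are combined.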
