FVG-PT: Adaptive Foreground View-Guided Prompt Tuning for Vision-Language Models
Haoyang Li, Liang Wang, Siyu Zhou, Jiacheng Sun, Jing Jiang, Chao Wang, Guodong Long, Yan Peng

TL;DR
This paper introduces FVG-PT, an adaptive prompt tuning method for vision-language models that guides visual attention towards foreground objects, improving task adaptation and addressing attention shift issues.
Contribution
FVG-PT is a plug-and-play foreground attention guidance module that combines a learnable Foreground Reliability Gate, a Foreground Distillation Compensation module, and a Prior Calibration module to improve prompt tuning in VLMs.
Findings
FVG-PT improves foreground attention alignment across models.
It improves prompt tuning results on multiple datasets.
It is compatible with various backbone models.
Abstract
CLIP-based prompt tuning enables pretrained Vision-Language Models (VLMs) to efficiently adapt to downstream tasks. Although existing studies have made significant progress, they pay limited attention to changes in the internal attention representations of VLMs during the tuning process. In this paper, we attribute the failure modes of prompt tuning predictions to shifts in foreground attention of the visual encoder, and propose Foreground View-Guided Prompt Tuning (FVG-PT), an adaptive plug-and-play foreground attention guidance module, to alleviate the shifts. Concretely, FVG-PT introduces a learnable Foreground Reliability Gate to automatically enhance the foreground view quality, applies a Foreground Distillation Compensation module to guide visual attention toward the foreground, and further introduces a Prior Calibration module to mitigate generalization degradation caused by…
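The abstract does not give the loss formulation, so the following is only a rough sketch of how a gated foreground attention guidance term might look, not the authors' implementation. The function name `gated_foreground_loss`, the sigmoid gate, and the choice of KL divergence are all assumptions made for illustration: the tuned encoder's attention over image patches is pulled toward the attention obtained from a foreground view, and a learnable scalar gate scales how strongly that guidance is applied.

```python
import math


def softmax(scores):
    """Turn raw attention scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def gated_foreground_loss(tuned_scores, foreground_scores, gate_logit):
    """Hypothetical guidance term (an assumption, not the paper's loss).

    tuned_scores:      per-patch attention scores from the prompt-tuned encoder
    foreground_scores: per-patch attention scores from a foreground view
    gate_logit:        learnable scalar; sigmoid(gate_logit) acts as a
                       reliability weight on the foreground guidance
    """
    tuned_attn = softmax(tuned_scores)
    fg_attn = softmax(foreground_scores)
    gate = sigmoid(gate_logit)
    return gate * kl_divergence(fg_attn, tuned_attn)


# Toy example: four patches, foreground view concentrates on patch 2.
uniform = [0.0, 0.0, 0.0, 0.0]
focused = [0.0, 0.0, 5.0, 0.0]
loss_trusted = gated_foreground_loss(uniform, focused, gate_logit=5.0)
loss_distrusted = gated_foreground_loss(uniform, focused, gate_logit=-5.0)
```

In this toy setup, a foreground view that the gate trusts contributes a larger penalty for attention drift than one it distrusts, and the term vanishes when the tuned attention already matches the foreground attention.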
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
