iVPT: Improving Task-relevant Information Sharing in Visual Prompt Tuning by Cross-layer Dynamic Connection

Nan Zhou; Jiaxin Chen; Di Huang

arXiv:2404.05207 · cs.CV · April 9, 2024 · 1 citation

Open Access

TL;DR

iVPT introduces a cross-layer dynamic connection and an attentive reinforcement mechanism to improve task-relevant information sharing in visual prompt tuning, leading to better performance across various vision tasks.

Contribution

The paper proposes iVPT, a novel VPT method with cross-layer dynamic connections and an attentive reinforcement mechanism for enhanced task-relevant information sharing.

Findings

01 Outperforms state-of-the-art methods on 24 benchmarks.

02 Effectively shares task-relevant information across layers.

03 Enhances attention process flexibility in VPT.

Abstract

Recent progress has shown the great potential of visual prompt tuning (VPT) for adapting pre-trained vision transformers to various downstream tasks. However, most existing solutions optimize prompts independently at each layer, thereby neglecting the task-relevant information encoded in prompt tokens across layers. Additionally, existing prompt structures are prone to interference from task-irrelevant noise in input images, which can harm the sharing of task-relevant information. In this paper, we propose a novel VPT approach, iVPT. It incorporates a cross-layer dynamic connection (CDC) for input prompt tokens from adjacent layers, enabling effective sharing of task-relevant information. Furthermore, we design a dynamic aggregation (DA) module that facilitates selective sharing of information between layers. The combination of CDC and DA enhances the…
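To make the idea of connecting prompt tokens across adjacent layers concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the abstract does not specify how CDC or DA compute their weights, so the sigmoid gate, the per-token `gate_logits`, and the helper names `blend_prompts` and `propagate` are all assumptions made for illustration. The sketch only shows the general pattern of mixing prompts carried over from the previous layer with the current layer's own prompts.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def blend_prompts(prev_prompts, new_prompts, gate_logits):
    """Mix prompts carried from the previous layer with this layer's prompts.

    prev_prompts, new_prompts: lists of prompt vectors (lists of floats).
    gate_logits: one scalar per prompt token. ASSUMPTION: a hypothetical
    learnable gate; the paper's actual weighting may differ.
    """
    blended = []
    for prev, new, logit in zip(prev_prompts, new_prompts, gate_logits):
        g = sigmoid(logit)  # g near 1 keeps the carried prompt, near 0 the new one
        blended.append([g * p + (1.0 - g) * n for p, n in zip(prev, new)])
    return blended


def propagate(layer_prompts, gate_logits, layer_fn):
    """Pass prompts through a stack of layers with cross-layer blending.

    layer_prompts[l]: prompts owned by layer l.
    gate_logits[l]: gates for the connection into layer l (index 0 is unused).
    layer_fn: hypothetical stand-in for a transformer block. A real block
    would process prompts jointly with image tokens; that is omitted here.
    """
    carried = layer_fn(layer_prompts[0])
    for prompts, logits in zip(layer_prompts[1:], gate_logits[1:]):
        carried = layer_fn(blend_prompts(carried, prompts, logits))
    return carried


if __name__ == "__main__":
    prompts = [[[1.0, 0.0]], [[3.0, 2.0]]]   # two layers, one prompt token each
    gates = [[0.0], [0.0]]                   # logit 0 gives an equal mix
    print(propagate(prompts, gates, layer_fn=lambda tokens: tokens))
```

With an identity `layer_fn` and a gate logit of 0, each layer's input prompt is the average of the carried prompt and its own prompt. By contrast, independently optimized per-layer prompts correspond to ignoring the carried prompts entirely.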

Taxonomy

Topics: Personal Information Management and User Behavior · Virtual Reality Applications and Impacts