GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph

Xin Li, Dongze Lian, Zhihe Lu, Jiawang Bai, Zhibo Chen, and Xinchao Wang

arXiv:2309.13625 · cs.CV · September 26, 2023 · 21 cites

TL;DR

GraphAdapter introduces a dual knowledge graph approach to enhance vision-language model tuning by explicitly modeling inter-class relationships across visual and textual modalities, leading to improved performance on multiple benchmarks.

Contribution

The paper proposes GraphAdapter, an adapter-style tuning method that models dual-modality knowledge graphs to better exploit inter-class relationships in vision-language models.

Findings

01 Significantly outperforms previous adapter-based methods on 11 benchmarks.

02 Effectively leverages inter-class relationships across modalities.

03 Enhances classifier performance with dual knowledge graph modeling.

Abstract

Adapter-style efficient transfer learning (ETL) has shown excellent performance in the tuning of vision-language models (VLMs) under the low-data regime, where only a few additional parameters are introduced to excavate the task-specific knowledge based on the general and powerful representation of VLMs. However, most adapter-style works face two limitations: (i) modeling task-specific knowledge with a single modality only; and (ii) overlooking the exploitation of the inter-class relationships in downstream tasks, thereby leading to sub-optimal solutions. To mitigate that, we propose an effective adapter-style tuning strategy, dubbed GraphAdapter, which performs the textual adapter by explicitly modeling the dual-modality structure knowledge (i.e., the correlation of different semantics/classes in textual and visual modalities) with a dual knowledge graph. In particular, the dual…
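The abstract is cut off before it describes the dual knowledge graph in detail, so the following is only a minimal sketch of the general idea it states: adapting each class's text feature with inter-class structure drawn from both the textual and the visual modality. Everything specific here is an assumption made for illustration and is not taken from the source: the node definitions (class text features and per-class visual prototypes), the softmax-normalized cosine-similarity edges, the single linear aggregation step, the mixing coefficients `alpha` and `beta`, and all function names.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def softmax_rows(scores):
    """Numerically stable softmax over the last axis."""
    scores = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)


def graph_aggregate(query, nodes, weight):
    """One graph-convolution-like step (hypothetical form).

    Edges are cosine similarities between each query feature and every
    node, normalized per row; the result is a weighted sum of node
    features followed by a linear map.
    """
    adjacency = softmax_rows(l2_normalize(query) @ l2_normalize(nodes).T)
    return adjacency @ nodes @ weight


def adapt_text_features(text_feats, visual_protos, w_text, w_vis,
                        alpha=0.5, beta=0.7):
    """Mix each class text feature with structure from two sub-graphs.

    alpha balances textual vs. visual structure; beta keeps a residual
    share of the original text feature. Both are assumed hyperparameters.
    """
    from_text = graph_aggregate(text_feats, text_feats, w_text)
    from_visual = graph_aggregate(text_feats, visual_protos, w_vis)
    structure = alpha * from_text + (1.0 - alpha) * from_visual
    return l2_normalize(beta * text_feats + (1.0 - beta) * structure)


# Toy data standing in for frozen VLM features (random, not real outputs).
rng = np.random.default_rng(0)
num_classes, dim = 5, 16
text_feats = l2_normalize(rng.normal(size=(num_classes, dim)))
visual_protos = l2_normalize(rng.normal(size=(num_classes, dim)))
w_text = np.eye(dim)  # in a real adapter these would be learned
w_vis = np.eye(dim)

classifier = adapt_text_features(text_feats, visual_protos, w_text, w_vis)
image_feat = l2_normalize(rng.normal(size=(1, dim)))
logits = image_feat @ classifier.T  # one similarity score per class
```

In this sketch only `w_text` and `w_vis` would be trainable, which matches the adapter-style setting of adding a few parameters on top of a frozen model; the official repository listed under Code & Models is the reference for the actual method.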

Code & Models

Repositories

lixinustc/graphadapter
PyTorch · Official

Videos

GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph · SlidesLive

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks

Methods: Adapter