Deep Prompt Tuning for Graph Transformers
Reza Shirkavand, Heng Huang

TL;DR
This paper introduces deep graph prompt tuning, a resource-efficient method that enhances graph transformer models by adding trainable tokens, achieving comparable or better performance than fine-tuning with fewer parameters.
Contribution
The paper presents a novel prompt tuning approach for graph transformers, reducing resource requirements and improving scalability while maintaining high performance.
Findings
Achieves comparable or superior performance to fine-tuning.
Reduces the number of task-specific parameters significantly.
Applicable to various graph neural network architectures.
Abstract
Graph transformers have gained popularity in various graph-based tasks by addressing challenges faced by traditional Graph Neural Networks. However, the quadratic complexity of self-attention operations and the extensive layering in graph transformer architectures present challenges when applying them to graph-based prediction tasks. Fine-tuning, a common approach, is resource-intensive and requires storing multiple copies of large models. We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning for leveraging large graph transformer models in downstream graph-based prediction tasks. Our method introduces trainable feature nodes to the graph and prepends task-specific tokens to the graph transformer, enhancing the model's expressive power. By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of…
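To give a rough sense of the parameter savings the abstract describes, the sketch below compares the trainable weights of a fully fine-tuned transformer stack with those of added prompt tokens and feature nodes. It is not the authors' implementation: the layer layout, the model sizes, the token counts, and the function names `transformer_params` and `deep_prompt_params` are all assumptions chosen for illustration.

```python
def transformer_params(num_layers: int, d_model: int, d_ff: int) -> int:
    """Approximate weight count of a standard transformer encoder stack.

    Assumed layout per layer: four attention projections (Q, K, V, output),
    a two-layer feed-forward block, and two LayerNorms, all with biases.
    """
    attention = 4 * d_model * d_model + 4 * d_model
    feed_forward = 2 * d_model * d_ff + d_ff + d_model
    layer_norms = 2 * 2 * d_model  # scale and shift for each of two norms
    return num_layers * (attention + feed_forward + layer_norms)


def deep_prompt_params(num_layers: int, d_model: int,
                       num_prompt_tokens: int, num_feature_nodes: int = 0) -> int:
    """Trainable parameters when the backbone is frozen.

    Assumption: every layer receives its own prepended prompt tokens of width
    d_model, and the added feature nodes are also d_model-dimensional vectors.
    """
    per_layer_tokens = num_layers * num_prompt_tokens * d_model
    feature_nodes = num_feature_nodes * d_model
    return per_layer_tokens + feature_nodes


# Hypothetical sizes, not taken from the paper.
full = transformer_params(num_layers=12, d_model=768, d_ff=3072)
prompt = deep_prompt_params(num_layers=12, d_model=768,
                            num_prompt_tokens=10, num_feature_nodes=5)
ratio = prompt / full

print(f"fine-tuning updates {full:,} weights per task")
print(f"prompt tuning updates {prompt:,} weights per task ({ratio:.4%} of the backbone)")
```

Under these assumed sizes, each additional task stores only the small prompt tensor rather than a full copy of the backbone, which is the storage argument the abstract makes against fine-tuning.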
Taxonomy
Topics: Advanced Graph Neural Networks · Online Learning and Analytics
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Byte Pair Encoding · Softmax · Dense Connections · Laplacian EigenMap · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Residual Connection
