Representing Long-Range Context for Graph Neural Networks with Global Attention
Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica

TL;DR
This paper introduces GraphTrans, a Transformer-based method that enhances graph neural networks by effectively capturing long-range dependencies, leading to state-of-the-art results in graph classification tasks.
Contribution
The paper applies a permutation-invariant Transformer module after a standard GNN module, together with a novel "readout" mechanism for obtaining a global graph embedding, to better model long-range relationships in graphs.
Findings
GraphTrans achieves state-of-the-art performance on several graph classification benchmarks.
Purely learning-based approaches that do not encode explicit graph structure in the attention module can still learn high-level, long-range relationships effectively.
The method outperforms traditional GNNs that struggle with long-range dependencies.
Abstract
Graph neural networks are powerful architectures for structured datasets. However, current methods struggle to represent long-range dependencies. Scaling the depth or width of GNNs is insufficient to broaden receptive fields as larger GNNs encounter optimization instabilities such as vanishing gradients and representation oversmoothing, while pooling-based approaches have yet to become as universally useful as in computer vision. In this work, we propose the use of Transformer-based self-attention to learn long-range pairwise relationships, with a novel "readout" mechanism to obtain a global graph embedding. Inspired by recent computer vision results that find position-invariant attention performant in learning long-range relationships, our method, which we call GraphTrans, applies a permutation-invariant Transformer module after a standard GNN module. This simple architecture leads to…
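The architecture described in the abstract (a GNN module followed by a Transformer module with no positional encoding, plus a readout) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a single mean-aggregation message-passing layer as a stand-in for the GNN, a single attention head with random weights, and a learnable CLS-style token as the readout; the abstract shown here does not spell out the readout mechanism, so that choice is an assumption. All names (`gnn_layer`, `self_attention`, `graph_embedding`, `params`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gnn_layer(h, adj, w):
    # One mean-aggregation message-passing step with self-loops and ReLU.
    # Stand-in for "a standard GNN module"; each node only sees 1-hop neighbors.
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1, keepdims=True)
    return np.maximum((a_hat / deg) @ h @ w, 0.0)

def self_attention(x, wq, wk, wv):
    # Single-head scaled dot-product attention over all tokens.
    # No positional encoding is added, so token order carries no information.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def graph_embedding(h, adj, params):
    # GNN first (local structure), then global attention (all node pairs).
    h = gnn_layer(h, adj, params["w_gnn"])
    # Assumed readout: prepend a CLS-style token and return its output row.
    tokens = np.vstack([params["cls"], h])
    out = self_attention(tokens, params["wq"], params["wk"], params["wv"])
    return out[0]

rng = np.random.default_rng(0)
n, d = 5, 8  # number of nodes, feature width (illustrative values)
params = {
    "w_gnn": rng.normal(size=(d, d)),
    "wq": rng.normal(size=(d, d)),
    "wk": rng.normal(size=(d, d)),
    "wv": rng.normal(size=(d, d)),
    "cls": rng.normal(size=(1, d)),
}

# A path graph 0-1-2-3-4: the two end nodes are four hops apart.
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
h = rng.normal(size=(n, d))

emb = graph_embedding(h, adj, params)

# Relabel the nodes: the graph is the same, only the node order changes.
perm = rng.permutation(n)
emb_perm = graph_embedding(h[perm], adj[np.ix_(perm, perm)], params)

print(np.allclose(emb, emb_perm))
```

In this sketch the GNN layer is permutation-equivariant and the attention step sums over all tokens without positional information, so the embedding read from the CLS-style token does not depend on how the nodes are ordered. The attention step also lets the two end nodes of the path interact directly, which a single message-passing layer cannot do.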
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Materials Science · Graph Theory and Algorithms
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Adam · Residual Connection · Dropout · Label Smoothing · Layer Normalization · Dense Connections
