Structure-Aware Transformer for Graph Representation Learning
Dexiong Chen, Leslie O'Bray, Karsten Borgwardt

TL;DR
This paper introduces the Structure-Aware Transformer, which integrates structural information into self-attention for graph representation learning. The resulting node representations better capture structural similarity between nodes, and the model achieves state-of-the-art results.
Contribution
It proposes a novel self-attention mechanism that incorporates subgraph structures, improving the expressiveness and performance of graph Transformers.
Findings
Achieves state-of-the-art results on five graph prediction benchmarks.
Systematically improves performance when combined with existing GNNs.
Is theoretically guaranteed to produce representations at least as expressive as those of subgraph-based methods.
Abstract
The Transformer architecture has gained growing attention in graph representation learning recently, as it naturally overcomes several limitations of graph neural networks (GNNs) by avoiding their strict structural inductive biases and instead only encoding the graph structure via positional encoding. Here, we show that the node representations generated by the Transformer with positional encoding do not necessarily capture structural similarity between them. To address this issue, we propose the Structure-Aware Transformer, a class of simple and flexible graph Transformers built upon a new self-attention mechanism. This new self-attention incorporates structural information into the original self-attention by extracting a subgraph representation rooted at each node before computing the attention. We propose several methods for automatically generating the subgraph representation and…
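The idea described in the abstract, extracting a subgraph representation rooted at each node before computing attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The structure extractor here (k rounds of mean aggregation over neighbors), the choice to compute queries and keys from the structural features and values from the raw features, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np

def k_hop_structure_features(adj, x, k=2):
    """Hypothetical structure extractor: k rounds of mean aggregation
    over each node's neighborhood (self-loops included)."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                            # add self-loops
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)   # row-normalize
    h = x
    for _ in range(k):
        h = a_hat @ h
    return h

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)            # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def structure_aware_attention(adj, x, w_q, w_k, w_v, k=2):
    """Single-head attention whose scores depend on subgraph features.
    Assumption: queries/keys use structural features, values use raw x."""
    s = k_hop_structure_features(adj, x, k)
    q, key, v = s @ w_q, s @ w_k, x @ w_v
    scores = q @ key.T / np.sqrt(q.shape[1])           # scaled dot product
    attn = softmax(scores, axis=-1)                    # each row sums to 1
    return attn @ v, attn

# Toy example: path graph 0-1-2-3 with node degree as the only feature.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = adj.sum(axis=1, keepdims=True)
rng = np.random.default_rng(0)
w_q, w_k, w_v = (rng.normal(size=(1, 3)) for _ in range(3))
out, attn = structure_aware_attention(adj, x, w_q, w_k, w_v, k=2)
```

In this toy graph the two end nodes (0 and 3) have identical neighborhoods, so they receive identical structural features and therefore identical attention rows, which is the kind of structural similarity the abstract refers to.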
Taxonomy
Topics: Advanced Graph Neural Networks
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Layer Normalization · Byte Pair Encoding · Dense Connections · Residual Connection · Absolute Position Encodings · Softmax
