Rethinking Graph Transformers with Spectral Attention
Devin Kreuzer, Dominique Beaini, William L. Hamilton, Vincent Létourneau, Prudencio Tossou

TL;DR
This paper introduces the Spectral Attention Network (SAN), a graph Transformer that uses spectral positional encodings to improve graph representation and outperform existing models on benchmark datasets.
Contribution
The paper proposes a novel spectral positional encoding method for graph Transformers, enabling full connectivity and better structural discrimination.
Findings
SAN outperforms state-of-the-art GNNs on benchmark datasets.
SAN significantly outperforms existing attention-based models.
The spectral encoding enhances the model's ability to distinguish graph structures.
Abstract
In recent years, the Transformer architecture has proven to be very successful in sequence processing, but its application to other data structures, such as graphs, has remained limited due to the difficulty of properly defining positions. Here, we present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) that can take advantage of the full Laplacian spectrum to learn the position of each node in a given graph. This LPE is then added to the node features of the graph and passed to a fully-connected Transformer. By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance. Further, by fully connecting the graph, the Transformer does not suffer from over-squashing, an information bottleneck of most GNNs, and enables better…
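To make the phrase "full Laplacian spectrum" concrete, the following is a minimal sketch of the raw spectral quantities that a positional encoding of this kind could start from. It is not the authors' implementation: the function names `laplacian_spectrum` and `raw_spectral_inputs`, the choice of the unnormalized Laplacian L = D - A, and the parameter `k` are all assumptions made for illustration, and the learned part of the LPE is not shown.

```python
import numpy as np

def laplacian_spectrum(adj):
    """Eigenvalues (ascending) and eigenvectors of the unnormalized Laplacian L = D - A.

    Assumption: an undirected graph given as a symmetric adjacency matrix.
    """
    adj = np.asarray(adj, dtype=float)
    deg = np.diag(adj.sum(axis=1))   # degree matrix D
    lap = deg - adj                  # L = D - A
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order
    return np.linalg.eigh(lap)

def raw_spectral_inputs(adj, k):
    """Hypothetical helper: for each node, pair the k lowest eigenvalues with that
    node's entry in the matching eigenvector. Output shape: (n_nodes, k, 2)."""
    eigvals, eigvecs = laplacian_spectrum(adj)
    n = eigvecs.shape[0]
    k = min(k, n)
    vals = np.broadcast_to(eigvals[:k], (n, k))   # same eigenvalues for every node
    return np.stack([vals, eigvecs[:, :k]], axis=-1)

# Example graph: a cycle on 4 nodes
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]])
eigvals, eigvecs = laplacian_spectrum(adj)
feats = raw_spectral_inputs(adj, k=3)
print(np.round(eigvals, 6))
print(feats.shape)
```

In a model like the one the abstract describes, per-node arrays of this kind would be fed to a learned module whose output is added to the node features. Note that eigenvector signs, and the basis chosen within a repeated eigenvalue, are not unique, so these raw values are only defined up to such choices.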
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Machine Learning in Materials Science
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Label Smoothing · Residual Connection · Dense Connections · Softmax · Dropout
