TL;DR
This paper introduces the Edge-augmented Graph Transformer (EGT), an architecture that combines global self-attention with dedicated edge channels to learn from graph-structured data, and reports better results than message-passing graph neural networks on benchmark datasets.
Contribution
EGT replaces localized convolutional aggregation with global self-attention and adds edge channels whose pairwise embeddings are updated from layer to layer, so structural information can evolve through the network and be used directly for edge-level predictions.
Findings
EGT outperforms message-passing GNNs on benchmark datasets.
EGT achieves state-of-the-art results on the OGB-LSC PCQM4Mv2 quantum-chemical regression task.
Global self-attention can effectively replace local convolutional aggregation in graph learning.
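To make the contrast in the last finding concrete, here is a minimal NumPy sketch, not taken from the paper. It compares a static neighbor-mean aggregation (one common form of local convolutional aggregation) with a plain dot-product self-attention over all node pairs. The function names, the toy path graph, and the use of unprojected features as queries and keys are all illustrative assumptions.

```python
import numpy as np

def local_mean_aggregation(h, adj):
    """Static local aggregation: weights come only from the adjacency matrix."""
    a = adj + np.eye(adj.shape[0])           # add self-loops
    w = a / a.sum(axis=1, keepdims=True)     # row-normalize; fixed for a given graph
    return w @ h, w

def global_attention_aggregation(h):
    """Global aggregation: weights depend on node features and cover all pairs."""
    scores = h @ h.T / np.sqrt(h.shape[1])   # scaled dot-product scores (no learned projections here)
    scores -= scores.max(axis=1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)        # row-wise softmax
    return w @ h, w

# Toy path graph 0 - 1 - 2 - 3 (hypothetical example data).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
h = np.arange(8, dtype=float).reshape(4, 2) / 8

_, w_local = local_mean_aggregation(h, adj)
_, w_global = global_attention_aggregation(h)
```

In this toy graph, nodes 0 and 3 are three hops apart, so `w_local[0, 3]` is zero while `w_global[0, 3]` is positive: the attention weights let distant nodes interact in a single layer, and they change whenever the node features change.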
Abstract
We propose an extension to the transformer neural network architecture for general-purpose graph learning by adding a dedicated pathway for pairwise structural information, called edge channels. The resultant framework, which we call Edge-augmented Graph Transformer (EGT), can directly accept, process and output structural information of arbitrary form, which is important for effective learning on graph-structured data. Our model exclusively uses global self-attention as an aggregation mechanism rather than static localized convolutional aggregation. This allows for unconstrained long-range dynamic interactions between nodes. Moreover, the edge channels allow the structural information to evolve from layer to layer, and prediction tasks on edges/links can be performed directly from the output embeddings of these channels. We verify the performance of EGT in a wide range of…
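The abstract describes the mechanism only in words, so the following is a hedged single-head sketch of how edge channels could interact with self-attention, not the authors' implementation. It assumes that edge embeddings add a scalar bias to each pairwise attention logit and that the logits are written back into the edge channels through a residual update. The parameter names (`Wq`, `Wk`, `Wv`, `We`, `Wo`) and shapes are hypothetical, and the sketch leaves out whatever normalization, multi-head structure, and feed-forward sublayers the actual model uses.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def edge_augmented_attention(h, e, params):
    """One single-head layer (illustrative assumption, see lead-in).

    h: (N, d) node embeddings
    e: (N, N, d_e) edge-channel embeddings, one vector per node pair
    Returns updated node embeddings, updated edge embeddings, attention weights.
    """
    d = h.shape[1]
    q = h @ params["Wq"]                      # (N, d) queries
    k = h @ params["Wk"]                      # (N, d) keys
    v = h @ params["Wv"]                      # (N, d) values
    bias = e @ params["We"]                   # (N, N): edge channels bias every pair
    logits = q @ k.T / np.sqrt(d) + bias      # pairwise logits over all node pairs
    attn = softmax(logits, axis=-1)           # global attention, no neighborhood mask
    h_new = h + attn @ v                      # residual node update
    # Assumed edge update: project each scalar logit back to d_e dimensions.
    e_new = e + logits[..., None] * params["Wo"]
    return h_new, e_new, attn

# Hypothetical sizes and random weights, for shape checking only.
rng = np.random.default_rng(0)
N, d, d_e = 5, 8, 4
params = {
    "Wq": rng.normal(scale=0.1, size=(d, d)),
    "Wk": rng.normal(scale=0.1, size=(d, d)),
    "Wv": rng.normal(scale=0.1, size=(d, d)),
    "We": rng.normal(scale=0.1, size=(d_e,)),
    "Wo": rng.normal(scale=0.1, size=(d_e,)),
}
h = rng.normal(size=(N, d))
e = rng.normal(size=(N, N, d_e))
h1, e1, attn = edge_augmented_attention(h, e, params)
```

Because the layer returns `e1` with the same shape as `e`, layers of this form can be stacked, and an edge or link prediction head could read from the final edge embeddings, which matches the abstract's description of predicting directly from the edge-channel outputs.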
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Convolution · Edge-augmented Graph Transformer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Laplacian EigenMap · Adam · Laplacian Positional Encodings
