Do Transformers Really Perform Bad for Graph Representation?
Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu

TL;DR
This paper introduces Graphormer, a Transformer-based model for graph representation learning that effectively encodes structural information, achieving state-of-the-art results and unifying many GNN variants.
Contribution
The paper presents Graphormer with novel structural encoding methods, demonstrating its superior performance and theoretical expressive power compared to existing GNNs.
Findings
Graphormer attains excellent results on large-scale graph benchmarks.
Structural encoding methods significantly improve Transformer performance on graphs.
Many GNN variants are special cases of Graphormer with appropriate encodings.
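
As a rough illustration of what "encoding structural information" into a Transformer can look like, the sketch below adds a bias to the attention scores that depends on the shortest-path distance between two nodes. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names, the `bias_per_distance` table (learnable in a real model, fixed here), the identity query/key/value projections, and the handling of unreachable pairs are all hypothetical choices made for illustration.

```python
import numpy as np

def shortest_path_distances(adj):
    """All-pairs hop counts via Floyd-Warshall; unreachable pairs stay at inf."""
    n = adj.shape[0]
    dist = np.where(adj > 0, 1.0, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        # Relax every pair (i, j) through intermediate node k.
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist

def biased_attention(x, adj, bias_per_distance, unreachable_bias=-1e9):
    """Single-head self-attention with an additive bias indexed by hop distance.

    Assumption: identity Q/K/V projections, to keep the sketch short.
    """
    dist = shortest_path_distances(adj)
    max_d = len(bias_per_distance) - 1
    # Clip distances to the size of the bias table (inf is clipped too).
    idx = np.minimum(dist, max_d).astype(int)
    # Assumption: unreachable pairs get a large negative bias (effectively masked).
    bias = np.where(np.isinf(dist), unreachable_bias, bias_per_distance[idx])
    scores = x @ x.T / np.sqrt(x.shape[1]) + bias
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

# Path graph 0 - 1 - 2 with random 4-dimensional node features.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
bias_per_distance = np.array([0.0, -1.0, -2.0])  # one scalar per hop count 0, 1, 2
out, weights = biased_attention(x, adj, bias_per_distance)
```

In this sketch, setting the bias to a large negative value for every distance greater than one would restrict each node to attending over itself and its direct neighbors, which gives some intuition for how neighborhood-aggregation GNNs could be viewed as special cases of attention with a suitable structural bias.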
Abstract
The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture, and could attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model…
Videos
Graphormer - Do Transformers Really Perform Bad for Graph Representation? | Paper Explained · YouTube
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Epigenetics and DNA Methylation
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Residual Connection · Dense Connections
