Do Transformers Really Perform Bad for Graph Representation?

Chengxuan Ying; Tianle Cai; Shengjie Luo; Shuxin Zheng; Guolin Ke; Di He; Yanming Shen; Tie-Yan Liu

arXiv:2106.05234 · cs.LG · November 25, 2021 · 128 cites

TL;DR

This paper introduces Graphormer, a Transformer-based model for graph representation learning that effectively encodes structural information, achieving state-of-the-art results and unifying many GNN variants.

Contribution

The paper presents Graphormer with novel structural encoding methods, demonstrating its superior performance and theoretical expressive power compared to existing GNNs.

Findings

1. Graphormer attains excellent results on large-scale graph benchmarks.

2. Structural encoding methods significantly improve Transformer performance on graphs.

3. Many GNN variants are special cases of Graphormer with appropriate encodings.
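
The third finding can be illustrated with a small numerical check. This listing does not say which GNN variants are covered or how, so the following is only a sketch under one assumption: that the structural encoding enters self-attention as an additive bias on the attention logits. With zero queries and keys, a bias of 0 for adjacent node pairs and negative infinity for all others, softmax attention reduces to mean aggregation over neighbors. The function name `attention_with_bias` and the toy graph are hypothetical, chosen for illustration.

```python
import numpy as np

def attention_with_bias(q, k, v, bias):
    """Single-head softmax attention with an additive bias on the logits."""
    logits = q @ k.T / np.sqrt(q.shape[-1]) + bias
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy star graph (assumed for illustration): node 0 is linked to nodes 1 and 2.
adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]])
x = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [4.0, 6.0]])

# Assumed bias: 0 for direct neighbors, -inf for every other pair.
bias = np.where(adj == 1, 0.0, -np.inf)

# Zero queries and keys make the content logits uniform, so only the bias matters.
zeros = np.zeros_like(x)
out = attention_with_bias(zeros, zeros, x, bias)

# Reference: plain mean aggregation over neighbors, as used in many GNN layers.
mean_agg = adj @ x / adj.sum(axis=1, keepdims=True)
```

Here `out` and `mean_agg` agree, which is the sense in which a neighbor-averaging layer is a special case of biased attention. The sketch assumes every node has at least one neighbor; an isolated node would produce an all-masked row.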

Abstract

The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture, and could attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model…
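
The abstract is truncated before it describes the structural encodings, so the following is a hedged sketch rather than the authors' implementation. It assumes one common way to inject graph structure into standard self-attention: add a learnable scalar, indexed by the shortest-path distance between two nodes, to their attention logit. The names `shortest_path_distances`, `biased_attention`, and `bias_table`, the single-head setup, and the toy path graph are all assumptions made for illustration.

```python
from collections import deque
import numpy as np

def shortest_path_distances(adj):
    """All-pairs hop distances by BFS; unreachable pairs are marked -1."""
    n = len(adj)
    dist = np.full((n, n), -1, dtype=int)
    for src in range(n):
        dist[src, src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[src, v] < 0:
                    dist[src, v] = dist[src, u] + 1
                    queue.append(v)
    return dist

def biased_attention(x, w_q, w_k, w_v, dist, bias_table):
    """Single-head self-attention with a distance-indexed bias on the logits."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    logits = q @ k.T / np.sqrt(q.shape[-1])
    # The last table entry is reserved for unreachable pairs (dist == -1);
    # larger distances are clipped to the last regular entry.
    last = len(bias_table) - 1
    idx = np.where(dist < 0, last, np.minimum(dist, last - 1))
    logits = logits + bias_table[idx]
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy path graph 0-1-2-3 (assumed for illustration).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
dist = shortest_path_distances(adj)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                       # random node features
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
bias_table = np.zeros(5)                          # would be learned in training

out, weights = biased_attention(x, w_q, w_k, w_v, dist, bias_table)
```

Because the bias depends only on graph distance, every node can still attend to every other node, while training is free to up-weight or down-weight pairs according to how far apart they are in the graph.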

Code & Models

Videos

Graphormer - Do Transformers Really Perform Bad for Graph Representation? | Paper Explained · YouTube

Taxonomy

Topics: Advanced Graph Neural Networks · Topic Modeling · Epigenetics and DNA Methylation

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Residual Connection · Dense Connections