Tokenized Graph Transformer with Neighborhood Augmentation for Node Classification in Large Graphs
Jinsong Chen, Chang Liu, Kaiyuan Gao, Gaichao Li, Kun He

TL;DR
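This paper introduces NAGphormer, a scalable graph Transformer that represents each node as a sequence of tokens built by aggregating its multi-hop neighborhood features. This allows mini-batch training on large graphs, and the model outperforms existing graph Transformers and GNNs on node classification benchmarks.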
Contribution
The paper proposes NAGphormer, a novel graph transformer architecture with neighborhood tokenization and augmentation, enabling scalable and more informative node representations for large graphs.
Findings
NAGphormer outperforms existing graph Transformers and GNNs on benchmark datasets.
The Hop2Token module effectively aggregates multi-hop neighborhood features.
Neighborhood Augmentation (NrAug) enhances model performance further.
Abstract
Graph Transformers, emerging as a new architecture for graph representation learning, suffer from the quadratic complexity on the number of nodes when handling large graphs. To this end, we propose a Neighborhood Aggregation Graph Transformer (NAGphormer) that treats each node as a sequence containing a series of tokens constructed by our proposed Hop2Token module. For each node, Hop2Token aggregates the neighborhood features from different hops into different representations, producing a sequence of token vectors as one input. In this way, NAGphormer could be trained in a mini-batch manner and thus could scale to large graphs. Moreover, we mathematically show that compared to a category of advanced Graph Neural Networks (GNNs), called decoupled Graph Convolutional Networks, NAGphormer could learn more informative node representations from multi-hop neighborhoods. In addition, we…
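The Hop2Token step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a dense adjacency matrix and a symmetrically normalized adjacency with self-loops, and the function name `hop2token` and the toy graph are hypothetical. For each node it stacks the features aggregated over 0, 1, ..., K hops into a token sequence, so that rows can later be sliced into mini-batches without touching the graph again.

```python
import numpy as np

def hop2token(adj, features, num_hops):
    """Build per-node token sequences from multi-hop aggregated features.

    Returns an array of shape (n_nodes, num_hops + 1, n_features), where
    token k of node v is row v of (A_norm ** k) @ features.
    """
    n = adj.shape[0]
    # Assumption: normalize as D^{-1/2} (A + I) D^{-1/2}; other
    # normalizations would work with the same loop below.
    a_hat = adj + np.eye(n)
    deg = a_hat.sum(axis=1)  # >= 1 for every node because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    tokens = [features]          # hop 0: the node's own features
    current = features
    for _ in range(num_hops):
        current = a_norm @ current   # one more hop of aggregation
        tokens.append(current)
    return np.stack(tokens, axis=1)

# Toy example: a path graph 0 - 1 - 2 - 3 with 3-dimensional features.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))

tokens = hop2token(adj, features, num_hops=2)

# The token sequences are precomputed once, so a mini-batch is a row slice.
batch = tokens[[0, 2]]
print(tokens.shape, batch.shape)
```

Because the aggregation is computed before training, the sequence length seen by the Transformer is K + 1 per node rather than the number of nodes in the graph, which is what makes mini-batch training possible in this setup.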
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Layer Normalization · Laplacian EigenMap · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Position-Wise Feed-Forward Layer
