Tokenphormer: Structure-aware Multi-token Graph Transformer for Node Classification
Zijie Zhou, Zhaoqi Lu, Xuekai Wei, Rongqin Chen, Shenghui Zhang, Pak Lon Ip, Leong Hou U

TL;DR
Tokenphormer introduces a structure-aware multi-token graph transformer that captures local, structural, and global information at multiple granularities, achieving state-of-the-art node classification results.
Contribution
It proposes a novel multi-token approach inspired by NLP, combining walk, SGPM, and hop tokens to enhance structural and contextual learning in graph transformers.
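As a rough illustration of what walk and hop tokens could look like, the sketch below builds both on a toy graph. It is not the authors' implementation. It assumes a walk token is the node sequence of a uniform random walk and a hop token is a node's feature after k rounds of mean propagation with a self-loop; this summary does not specify either construction. SGPM tokens are omitted because they are not described here. The names `uniform_walk` and `hop_tokens` are hypothetical.

```python
import random


def uniform_walk(adj, start, length, rng):
    """Return the node ids visited by a uniform random walk (assumed walk token)."""
    walk = [start]
    for _ in range(length):
        neighbors = adj[walk[-1]]
        if not neighbors:  # isolated node: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def hop_tokens(adj, features, max_hop):
    """Return tokens where tokens[k][v] is node v's feature after k mean-propagation rounds.

    Assumption: each round averages a node's own vector with its neighbors' vectors.
    """
    current = [list(f) for f in features]
    tokens = [current]
    for _ in range(max_hop):
        nxt = []
        for v in range(len(adj)):
            group = [v] + adj[v]  # self-loop plus direct neighbors
            dim = len(current[v])
            nxt.append([sum(current[u][d] for u in group) / len(group) for d in range(dim)])
        current = nxt
        tokens.append(current)
    return tokens


# Toy path graph 0 - 1 - 2 with one-dimensional features (illustrative data only).
adj = [[1], [0, 2], [1]]
features = [[1.0], [0.0], [0.0]]

rng = random.Random(0)
walk = uniform_walk(adj, start=0, length=4, rng=rng)
tokens = hop_tokens(adj, features, max_hop=2)

# Token sequence for node 0: its hop-0, hop-1, and hop-2 vectors.
node0_sequence = [tokens[k][0] for k in range(3)]
```

In a transformer setting, a per-node sequence such as `node0_sequence`, together with embedded walks, would be the input to self-attention; how the paper orders or combines the token types is not stated in this summary.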
Findings
Achieves state-of-the-art performance on node classification.
Effectively captures local and global graph information.
Outperforms existing GNNs and graph transformers.
Abstract
Graph Neural Networks (GNNs) are widely used in graph data mining tasks. Traditional GNNs follow a message passing scheme that can effectively utilize local and structural information. However, the phenomena of over-smoothing and over-squashing limit the receptive field in message passing processes. Graph Transformers were introduced to address these issues, achieving a global receptive field but suffering from the noise of irrelevant nodes and loss of structural information. Therefore, drawing inspiration from fine-grained token-based representation learning in Natural Language Processing (NLP), we propose the Structure-aware Multi-token Graph Transformer (Tokenphormer), which generates multiple tokens to effectively capture local and structural information and explore global information at different levels of granularity. Specifically, we first introduce the walk-token generated by…
Taxonomy
Topics: Advanced Graph Neural Networks
Methods: Attention Is All You Need · Laplacian EigenMap · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Label Smoothing · Laplacian Positional Encodings
