Rethinking Tokenized Graph Transformers for Node Classification
Jinsong Chen, Chenyang Li, GaiChao Li, John E. Hopcroft, Kun He

TL;DR
This paper introduces SwapGT, a novel token swapping method for tokenized graph Transformers that enhances node classification by generating more informative token sequences and employing a center alignment loss.
Contribution
SwapGT leverages a new token swapping operation and a center alignment loss to improve tokenized graph Transformers for node classification.
Findings
SwapGT outperforms existing methods on multiple datasets.
The token swapping operation increases the diversity of token sequences.
Center alignment loss enhances representation learning.
Abstract
Node tokenized graph Transformers (GTs) have shown promising performance in node classification. Token sequence generation is the key module in existing tokenized GTs: it transforms the input graph into token sequences, which a Transformer then uses to learn node representations. In this paper, we observe that token sequence generation in existing GTs focuses only on the first-order neighbors in the constructed similarity graphs. This limits the set of nodes available for building diverse token sequences and, in turn, restricts the potential of tokenized GTs for node classification. To this end, we propose a new method termed SwapGT. SwapGT first introduces a novel token swapping operation, based on the characteristics of token sequences, that fully leverages the semantic relevance of nodes to generate more informative token sequences. Then, SwapGT leverages a…
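The abstract gives the idea but not the exact procedure, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that a node's baseline token sequence is its top-k most similar nodes (its first-order neighbors on a k-NN similarity graph) and that a swap replaces each token with a randomly chosen token from that token's own sequence, which lets nodes beyond the first-order neighborhood enter the sequence. The function names `topk_token_sequences` and `swap_tokens`, the cosine similarity measure, and the sampling rule are all hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def topk_token_sequences(features, k):
    """Baseline: each node's tokens are its k most similar other nodes,
    i.e. its first-order neighbors on a k-NN similarity graph."""
    n = len(features)
    seqs = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: cosine(features[i], features[j]), reverse=True)
        seqs.append(others[:k])
    return seqs


def swap_tokens(seqs, rng):
    """Hypothetical swap (an assumption, not the paper's definition):
    replace each token t in node i's sequence with a random token drawn
    from t's own sequence, excluding i itself."""
    swapped_seqs = []
    for i, seq in enumerate(seqs):
        swapped = []
        for t in seq:
            candidates = [c for c in seqs[t] if c != i]
            # Fall back to the original token if nothing else is available.
            swapped.append(rng.choice(candidates) if candidates else t)
        swapped_seqs.append(swapped)
    return swapped_seqs


# Toy data: two loose clusters of 2-D node features.
features = [
    [1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
    [0.1, 0.9], [0.0, 1.0], [0.2, 0.8],
]
seqs = topk_token_sequences(features, k=2)
swapped = swap_tokens(seqs, random.Random(0))
```

Applying `swap_tokens` several times with different seeds would yield several distinct token sequences per node, which is one plausible reading of how swapping increases sequence diversity.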
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning and ELM · Privacy-Preserving Technologies in Data
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Softmax · Absolute Position Encodings · Dropout · Label Smoothing · Byte Pair Encoding
