PatchGT: Transformer over Non-trainable Clusters for Learning Graph Representations
Han Gao, Xu Han, Jiaoyang Huang, Jian-Xun Wang, Li-Ping Liu

TL;DR
PatchGT introduces a novel Transformer-based graph neural network that uses non-trainable spectral clustering to segment graphs into patches, enhancing efficiency and expressiveness in graph representation learning.
Contribution
The paper proposes PatchGT, a graph neural network that leverages spectral clustering for patch segmentation, improving performance and interpretability over previous trainable cluster methods.
Findings
PatchGT is more expressive than 1-WL-type GNNs.
It achieves competitive results on benchmark datasets.
The spectral clustering method is permutation invariant.
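The permutation-invariance finding can be checked numerically on a toy graph. The sketch below is not the paper's procedure: it assumes an unnormalized Laplacian and a two-way split by the sign of the Fiedler vector, and the helper name `fiedler_partition` is hypothetical. It relabels the nodes, recomputes the split, and maps it back to the original labels. The invariance it demonstrates relies on the second eigenvalue being simple; with repeated eigenvalues the eigenvectors are not unique and the check can fail.

```python
import numpy as np

def fiedler_partition(adj):
    """Split nodes in two by the sign of the Fiedler vector (illustrative stand-in)."""
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj      # unnormalized Laplacian L = D - A
    _, vecs = np.linalg.eigh(lap)             # eigenvalues come back in ascending order
    side = vecs[:, 1] > 0                     # eigenvector of the second-smallest eigenvalue
    # Return an unordered partition so the arbitrary sign of the eigenvector does not matter.
    return {frozenset(np.flatnonzero(side).tolist()),
            frozenset(np.flatnonzero(~side).tolist())}

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[u, v] = adj[v, u] = 1.0

original = fiedler_partition(adj)

# Relabel the nodes: new node i is old node perm[i].
perm = [3, 0, 5, 1, 4, 2]
adj_perm = adj[np.ix_(perm, perm)]

# Partition the relabelled graph, then translate back to the old labels.
permuted_back = {frozenset(perm[i] for i in part)
                 for part in fiedler_partition(adj_perm)}

print(original == permuted_back)
```

Comparing unordered sets of node sets, rather than label vectors, keeps the check independent of both the eigenvector sign and the cluster numbering.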
Abstract
Recently, the Transformer architecture has shown good performance in graph learning tasks. However, these Transformer models work directly on graph nodes and may have difficulty learning high-level information. Inspired by the vision transformer, which operates on image patches, we propose a new Transformer-based graph neural network: Patch Graph Transformer (PatchGT). Unlike previous Transformer-based models for learning graph representations, PatchGT learns from non-trainable graph patches rather than from nodes directly. This can save computation and improve model performance. The key idea is to segment a graph into patches using spectral clustering, without any trainable parameters; the model can then use GNN layers to learn patch-level representations and a Transformer to obtain graph-level representations. The architecture leverages the spectral information of…
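The segmentation step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, since the summary does not give the paper's exact procedure: the unnormalized Laplacian, the k-means step with a deterministic farthest-point initialization, and the hypothetical helpers `spectral_patches` and `patch_tokens`. Mean pooling stands in for the GNN layers that would produce patch-level representations before the Transformer.

```python
import numpy as np

def spectral_patches(adj, k, n_iter=50):
    """Assign each node to one of k patches; no trainable parameters involved."""
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj      # unnormalized Laplacian (assumed choice)
    _, vecs = np.linalg.eigh(lap)             # eigenvalues in ascending order
    emb = vecs[:, :k]                         # spectral embedding: k smoothest eigenvectors

    # Deterministic farthest-point initialization of the k-means centers.
    centers = [emb[0]]
    for _ in range(1, k):
        d = np.min([np.sum((emb - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(d)])
    centers = np.array(centers)

    # Plain Lloyd iterations on the embedded nodes.
    for _ in range(n_iter):
        dist = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):           # keep the old center if a patch is empty
                new_centers[j] = emb[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def patch_tokens(features, labels, k):
    """Mean-pool node features per patch (stand-in for the GNN stage; assumes no empty patch)."""
    return np.stack([features[labels == j].mean(axis=0) for j in range(k)])

# Toy graph: two triangles joined by one edge, segmented into two patches.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[u, v] = adj[v, u] = 1.0

labels = spectral_patches(adj, k=2)
tokens = patch_tokens(np.eye(6), labels, k=2)  # one token per patch, shape (2, 6)
print(labels, tokens.shape)
```

In a full model, the rows of `tokens` would be the sequence passed to the Transformer, so attention cost grows with the number of patches instead of the number of nodes.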
Taxonomy
Topics: Advanced Graph Neural Networks · Bioinformatics and Genomic Networks
Methods: Multi-Head Attention · Attention Is All You Need · Laplacian EigenMap · Layer Normalization · Laplacian Positional Encodings · Softmax · Adam · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer
