Generative Graph Pattern Machine
Zehong Wang, Zheyuan Zhang, Tianyi Ma, Chuxu Zhang, Yanfang Ye

TL;DR
G$^2$PM introduces a generative Transformer framework for graphs that overcomes message-passing limitations, enabling scalable, transferable representations across various graph learning tasks.
Contribution
It presents a novel generative pre-training approach for graphs using Transformers, surpassing message-passing GNNs in scalability and performance.
Findings
G$^2$PM continues to improve as it scales up to 60M parameters, with larger configurations outperforming smaller ones.
It improves performance across node, link, and graph classification tasks.
The model demonstrates strong transfer learning and cross-graph pretraining capabilities.
Abstract
Graph neural networks (GNNs) have been predominantly driven by message-passing, where node representations are iteratively updated via local neighborhood aggregation. Despite their success, message-passing suffers from fundamental limitations -- including constrained expressiveness, over-smoothing, over-squashing, and limited capacity to model long-range dependencies. These issues hinder scalability: increasing data size or model size often fails to yield improved performance. To this end, we explore pathways beyond message-passing and introduce Generative Graph Pattern Machine (G$^2$PM), a generative Transformer pre-training framework for graphs. G$^2$PM represents graph instances (nodes, edges, or entire graphs) as sequences of substructures, and employs generative pre-training over the sequences to learn generalizable and transferable representations. Empirically, G$^2$PM…
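Illustrative sketch
The abstract describes representing a graph instance as a sequence of substructures and pre-training generatively over that sequence. The Python sketch below is one possible reading of that idea, not the authors' implementation. It makes two assumptions that the text above does not confirm: substructures are sampled as short random walks from the anchor node, and the generative objective is approximated by masking some substructure tokens so that a model would have to reconstruct them. All function names, the `[MASK]` token, and the parameters are hypothetical.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token for a hidden substructure


def sample_substructure(adj, start, walk_length, rng):
    """Sample one substructure as a random walk of nodes beginning at `start`.

    Assumption: random walks stand in for whatever substructure sampler
    the paper actually uses.
    """
    walk = [start]
    for _ in range(walk_length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return tuple(walk)


def instance_to_sequence(adj, node, num_substructures, walk_length, seed=0):
    """Represent a node instance as a sequence of sampled substructures."""
    rng = random.Random(seed)
    return [
        sample_substructure(adj, node, walk_length, rng)
        for _ in range(num_substructures)
    ]


def mask_sequence(sequence, mask_ratio, seed=0):
    """Hide a fraction of the substructure tokens.

    Returns the corrupted sequence and a dict mapping each masked position
    to the original token, which a model would be trained to reconstruct.
    """
    rng = random.Random(seed)
    num_masked = max(1, int(len(sequence) * mask_ratio))
    masked_positions = set(rng.sample(range(len(sequence)), num_masked))
    corrupted = [
        MASK if i in masked_positions else token
        for i, token in enumerate(sequence)
    ]
    targets = {i: sequence[i] for i in masked_positions}
    return corrupted, targets


# Toy undirected graph as an adjacency list (illustrative data only).
toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
sequence = instance_to_sequence(toy_graph, node=0, num_substructures=4, walk_length=3)
corrupted, targets = mask_sequence(sequence, mask_ratio=0.5)
```

In a full pipeline, each substructure token would be embedded and fed to a Transformer encoder, with a decoder or prediction head reconstructing the masked tokens. Those components are omitted here because the text above does not specify them.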
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Residual Connection · Byte Pair Encoding
