A Pure Transformer Pretraining Framework on Text-attributed Graphs
Yu Song, Haitao Mao, Jiachen Xiao, Jingzhe Liu, Zhikai Chen, Wei Jin, Carl Yang, Jiliang Tang, Hui Liu

TL;DR
This paper introduces GSPT, a Transformer-based pretraining framework for text-attributed graphs that leverages unified text features to improve transferability and performance in node classification and link prediction tasks.
Contribution
It proposes a novel feature-centric pretraining approach using Transformers that treats graph structure as a prior, enhancing transferability across graphs within the same domain.
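One plausible reading of "treating graph structure as a prior" is that the graph only decides which node features are grouped into a sequence, while a Transformer then operates on those features alone. The sketch below illustrates that reading with a plain random walk over a toy graph. The sampling scheme, the function names (`sample_context`, `context_features`), and the toy data are all assumptions made for illustration; the summary above does not specify how GSPT builds its inputs.

```python
import random


def sample_context(adjacency, start, walk_length, rng):
    """Return a random walk of at most `walk_length` nodes starting at `start`.

    Hypothetical helper: the graph is used only to choose which nodes
    appear together in a sequence (structure as a sampling prior).
    """
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adjacency[walk[-1]]
        if not neighbors:  # isolated node or dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def context_features(features, walk):
    """Look up the (text-derived) feature vector of every node in the walk."""
    return [features[node] for node in walk]


# Toy undirected graph as an adjacency list (made-up data).
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

# Stand-ins for unified text embeddings; real ones would come from a language model.
features = {
    0: [0.1, 0.3],
    1: [0.2, 0.1],
    2: [0.9, 0.4],
    3: [0.5, 0.5],
}

rng = random.Random(0)  # seeded for reproducibility
walk = sample_context(adjacency, start=0, walk_length=5, rng=rng)
sequence = context_features(features, walk)  # would be the Transformer input
```

Under this assumption, the model downstream never sees an adjacency matrix: it receives only `sequence`, so the same model could in principle be applied to any graph whose node features live in the same embedding space.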
Findings
GSPT outperforms traditional methods in transferability among graphs.
The framework achieves promising results on node classification and link prediction.
Unified text features reduce reliance on graph structure, improving generalization.
Abstract
Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, achieving remarkable successes as evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to fundamental challenges such as feature heterogeneity and structural heterogeneity. Recently, increasing efforts have been made to enhance node feature quality with Large Language Models (LLMs) on text-attributed graphs (TAGs), demonstrating superiority to traditional bag-of-words or word2vec techniques. These high-quality node features reduce the previously critical role of graph structure, resulting in a modest performance gap between Graph Neural Networks (GNNs) and structure-agnostic Multi-Layer Perceptrons (MLPs). Motivated by this, we introduce a feature-centric pretraining perspective by treating graph structure as a prior and leveraging the rich, unified…
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Linear Layer · Multi-Head Attention · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam
