A Pure Transformer Pretraining Framework on Text-attributed Graphs

Yu Song; Haitao Mao; Jiachen Xiao; Jingzhe Liu; Zhikai Chen; Wei Jin; Carl Yang; Jiliang Tang; Hui Liu

arXiv:2406.13873 · cs.AI · June 21, 2024 · 1 citation

TL;DR

This paper introduces GSPT, a Transformer-based pretraining framework for text-attributed graphs that leverages unified text features to improve transferability and performance in node classification and link prediction tasks.

Contribution

The paper proposes a feature-centric pretraining approach that uses Transformers and treats graph structure as a prior, improving transferability across graphs within the same domain.

Findings

1. GSPT outperforms traditional methods in transferability among graphs.
2. The framework achieves promising results on node classification and link prediction.
3. Unified text features reduce reliance on graph structure, improving generalization.

Abstract

Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, achieving remarkable successes as evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to fundamental challenges such as feature heterogeneity and structural heterogeneity. Recently, increasing efforts have been made to enhance node feature quality with Large Language Models (LLMs) on text-attributed graphs (TAGs), demonstrating superiority to traditional bag-of-words or word2vec techniques. These high-quality node features reduce the previously critical role of graph structure, resulting in a modest performance gap between Graph Neural Networks (GNNs) and structure-agnostic Multi-Layer Perceptrons (MLPs). Motivated by this, we introduce a feature-centric pretraining perspective by treating graph structure as a prior and leveraging the rich, unified…

Code & Models

Repositories

songyyyy/gspt (Official)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Linear Layer · Multi-Head Attention · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam