Retrofitting Temporal Graph Neural Networks with Transformer

Qiang Huang; Xiao Yan; Xin Wang; Susie Xi Rao; Zhichao Han; Fangcheng Fu; Wentao Zhang; Jiawei Jiang

arXiv:2409.05477 · cs.LG · September 19, 2024

TL;DR

This paper introduces TF-TGN, which uses a Transformer decoder as the backbone of a temporal graph neural network. This lets training reuse the high-performance attention kernels (e.g., flash-attention) and distributed training schemes (e.g., PyTorch FSDP, DeepSpeed) built for Transformers, speeding up training while keeping accuracy comparable to existing TGNNs.

Contribution

The paper proposes TF-TGN, the first approach to unify TGNNs with Transformer decoders. It improves training efficiency and scalability while achieving accuracy comparable to or better than existing methods.

Findings

1. TF-TGN accelerates training by over 2.20 times.
2. It maintains comparable or superior accuracy to state-of-the-art TGNNs.
3. The approach effectively leverages Transformer kernels and parallelization for temporal graphs.

Abstract

Temporal graph neural networks (TGNNs) outperform regular GNNs by incorporating time information into graph-based operations. However, TGNNs adopt specialized models (e.g., TGN, TGAT, and APAN) and require tailored training frameworks (e.g., TGL and ETC). In this paper, we propose TF-TGN, which uses a Transformer decoder as the backbone model for TGNNs so that they can reuse the Transformer codebase for efficient training. In particular, Transformers have achieved tremendous success in language modeling, and the community has therefore developed high-performance kernels (e.g., flash-attention and memory-efficient attention) and efficient distributed training schemes (e.g., PyTorch FSDP, DeepSpeed, and Megatron-LM). We observe that TGNNs resemble language modeling, i.e., the message aggregation operation between chronologically occurring nodes and their temporal neighbors in TGNNs can be structured as sequence…
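The analogy in the abstract can be illustrated with a toy sketch. The abstract is truncated here, so this is only one reading of the general idea, not the TF-TGN implementation: each node's temporal neighbors are put in chronological order, and attention over that sequence is causally masked, as in a Transformer decoder. The event format `(src, dst, timestamp)`, the function names `build_neighbor_sequences` and `causal_attention`, and the two-dimensional embeddings are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_neighbor_sequences(events):
    """Group interactions by node, ordered by timestamp.

    events: iterable of (src, dst, timestamp) tuples (hypothetical format).
    Returns {node: [(neighbor, timestamp), ...]} in chronological order.
    """
    seqs = defaultdict(list)
    for src, dst, t in sorted(events, key=lambda e: e[2]):
        seqs[src].append((dst, t))
        seqs[dst].append((src, t))
    return dict(seqs)


def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position i sees only positions <= i."""
    d = len(queries[0])
    out = []
    for i, q in enumerate(queries):
        # Causal mask: only earlier (or current) neighbors contribute.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        out.append([
            sum(w * values[j][k] for j, w in enumerate(weights))
            for k in range(len(values[0]))
        ])
    return out


# Toy usage: three timestamped interactions and made-up neighbor embeddings.
events = [("a", "b", 1.0), ("a", "c", 2.0), ("b", "c", 3.0)]
seqs = build_neighbor_sequences(events)
emb = {"b": [1.0, 0.0], "c": [0.0, 1.0]}  # hypothetical features
x = [emb[n] for n, _ in seqs["a"]]        # node "a": neighbors b, then c
out = causal_attention(x, x, x)
```

In this sketch the first position can attend only to itself, while the second mixes both neighbors. That masking pattern is what allows a temporal neighbor sequence to be fed to decoder-style attention kernels.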

Code & Models

Repositories

qianghuangwhu/tf-tgn (official)

Taxonomy

Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Neural Networks and Applications

Methods: Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Dropout · Layer Normalization · Temporal Graph Network · Position-Wise Feed-Forward Layer · Linear Layer