ParaFormer: A Generalized PageRank Graph Transformer for Graph Representation Learning

Chaohao Yuan; Zhenjie Song; Ercan Engin Kuruoglu; Kangfei Zhao; Yang Liu; Deli Zhao; Hong Cheng; Yu Rong

arXiv:2512.14619 · cs.LG · December 17, 2025

TL;DR

ParaFormer introduces a PageRank-enhanced attention mechanism in graph transformers to mitigate over-smoothing, leading to improved performance in node and graph classification tasks across diverse datasets.

Contribution

It proposes a novel PageRank-based attention module that acts as an adaptive-pass filter, effectively reducing over-smoothing in graph transformers.
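
One way to read the filtering claim, as a sketch under assumptions this page does not state: let $A$ be a softmax attention matrix (row-stochastic with strictly positive entries), assume it is diagonalizable, and let $\pi$ be its stationary distribution. The weights $\gamma_k$ and the order $K$ are illustrative symbols for a generalized PageRank polynomial, not necessarily the paper's notation.

```latex
\begin{align*}
A &= V \Lambda V^{-1}, \qquad \lambda_1 = 1, \quad |\lambda_i| < 1 \ \ (i \ge 2), \\
A^{k} X &= V \Lambda^{k} V^{-1} X \;\xrightarrow{\,k \to \infty\,}\; \mathbf{1}\,\pi^{\top} X, \\
\Big( \sum_{k=0}^{K} \gamma_k A^{k} \Big) X &= V\, h(\Lambda)\, V^{-1} X,
\qquad h(\lambda) = \sum_{k=0}^{K} \gamma_k \lambda^{k}.
\end{align*}
```

Stacking attention corresponds to $h(\lambda) = \lambda^{k}$, which in the limit keeps only $\lambda_1 = 1$, so every node receives the same row $\pi^{\top} X$ (a low-pass response). Free weights $\gamma_k$, possibly negative, let $h$ emphasize other parts of the spectrum, which is one plausible reading of "adaptive-pass".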

Findings

01. ParaFormer outperforms existing models on 11 datasets.

02. Theoretical analysis confirms adaptive-pass filtering reduces over-smoothing.

03. Empirical results show consistent improvements in classification accuracy.

Abstract

Graph Transformers (GTs) have emerged as a promising graph learning tool, leveraging their all-pair connectivity to capture global information effectively. Global attention was originally introduced to sidestep the over-smoothing problem of deep GNNs, since it removes the need to stack many GNN layers. However, through empirical and theoretical analysis, we verify that global attention itself exhibits severe over-smoothing: its inherent low-pass filtering causes node representations to become indistinguishable, and the effect is even stronger than that observed in GNNs. To mitigate this, we propose PageRank Transformer (ParaFormer), which features a PageRank-enhanced attention module designed to mimic the behavior of deep Transformers. We theoretically and empirically demonstrate that ParaFormer mitigates over-smoothing by functioning as an adaptive-pass filter. Experiments…
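
The over-smoothing effect described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's module: it assumes dot-product attention with no learned projections, reuses one attention matrix at every layer, and uses fixed personalized-PageRank-style weights `gammas` where a real model would presumably learn them. The toy features and all function names are hypothetical.

```python
import math


def softmax_rows(scores):
    """Row-wise softmax, so every row is a positive distribution over nodes."""
    result = []
    for row in scores:
        peak = max(row)
        exps = [math.exp(s - peak) for s in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def spread(x):
    """Largest per-feature gap between any two nodes; 0 means full collapse."""
    return max(max(col) - min(col) for col in zip(*x))


def stacked_attention(attn, x, depth):
    """Apply the same global attention matrix `depth` times (a deep stack)."""
    for _ in range(depth):
        x = matmul(attn, x)
    return x


def gpr_attention(attn, x, gammas):
    """Generalized-PageRank-style mix: sum_k gammas[k] * attn^k @ x."""
    hop = x
    out = [[gammas[0] * v for v in row] for row in x]
    for gamma in gammas[1:]:
        hop = matmul(attn, hop)
        out = [[o + gamma * h for o, h in zip(out_row, hop_row)]
               for out_row, hop_row in zip(out, hop)]
    return out


# Toy node features (4 nodes, 2 features); values are made up for illustration.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Dot-product attention with no learned projections (a simplification).
x_t = [list(col) for col in zip(*x)]
attn = softmax_rows(matmul(x, x_t))

depth = 20
alpha = 0.5  # hypothetical teleport-style weight, fixed rather than learned
gammas = [alpha * (1 - alpha) ** k for k in range(depth + 1)]

deep = stacked_attention(attn, x, depth)
mixed = gpr_attention(attn, x, gammas)
print("spread of input:        ", spread(x))
print("spread after 20 layers: ", spread(deep))
print("spread with GPR weights:", spread(mixed))
```

In this toy setting the stacked output collapses toward identical node rows, while the weighted sum over hops keeps nodes distinguishable because the zero-hop term retains part of the input signal. Allowing some `gammas` to be negative would change which frequencies are emphasized, which is the direction the "adaptive-pass" description points to.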

Taxonomy

Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Domain Adaptation and Few-Shot Learning