Supercharging Graph Transformers with Advective Diffusion

Qitian Wu; Chenxiao Yang; Kaipeng Zeng; Michael Bronstein

arXiv:2310.06417·cs.LG·June 24, 2025

TL;DR

AdvDIFFormer is a physics-inspired graph Transformer that improves generalization under topological shifts by modeling continuous message passing through advective diffusion equations, outperforming traditional graph neural networks.

Contribution

The paper introduces AdvDIFFormer, a novel graph Transformer based on advective diffusion equations, with provable generalization capabilities under topological shifts.

Findings

1. Outperforms existing models in information networks, molecular screening, and protein interaction tasks.
2. Provides theoretical guarantees for controlling generalization error under topological shifts.
3. Demonstrates superior empirical performance across diverse graph-based predictive tasks.

Abstract

The capability to generalize is a cornerstone of the success of modern learning systems. For non-Euclidean data such as graphs, which inherently involve topological structures, one important aspect neglected by prior studies is how machine learning models generalize under topological shifts. This paper proposes Advective Diffusion Transformer (AdvDIFFormer), a physics-inspired graph Transformer model designed to address this challenge. The model is derived from advective diffusion equations, which describe a class of continuous message passing processes with observed and latent topological structures. We show that AdvDIFFormer has a provable capability for controlling generalization error under topological shifts, which, in contrast, cannot be guaranteed by graph diffusion models, i.e., the generalized formulation of common graph neural networks in continuous space. Empirically, the model…
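To make the idea of continuous message passing over observed and latent structures concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: it assumes dynamics of the form dZ/dt = (C + beta * V - I) Z, where C is a row-normalized attention-like matrix standing in for the latent structure, V is the symmetrically normalized adjacency of the observed graph, and the equation is integrated with explicit Euler steps. The function name `advective_diffusion` and the parameters `beta`, `step`, and `num_steps` are hypothetical names chosen for this example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def softmax_rows(scores):
    """Row-wise softmax, so each row sums to 1."""
    out = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of the observed graph."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def advective_diffusion(z0, adj, beta=0.5, step=0.1, num_steps=20):
    """Euler-integrate the assumed dynamics dZ/dt = (C + beta*V - I) Z.

    C (latent coupling) is a softmax over dot products of the initial
    features; V (observed coupling) is the normalized adjacency.
    Both choices are assumptions made for this sketch.
    """
    z0_t = [list(col) for col in zip(*z0)]
    c = softmax_rows(matmul(z0, z0_t))   # latent, attention-like term
    v = normalized_adjacency(adj)        # observed-topology term
    z = [row[:] for row in z0]
    for _ in range(num_steps):
        cz, vz = matmul(c, z), matmul(v, z)
        z = [[z[i][d] + step * (cz[i][d] + beta * vz[i][d] - z[i][d])
              for d in range(len(z[0]))] for i in range(len(z))]
    return z

# Usage: three nodes on a path graph, two features per node.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
z = advective_diffusion([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], adj)
```

Setting `beta=0.0` in this sketch removes the observed-graph term and leaves only the attention-like coupling, which is one way to see the two structures acting separately.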

Taxonomy

Topics: Advanced Graph Neural Networks

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Softmax · Byte Pair Encoding · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection