Generalizable Insights for Graph Transformers in Theory and Practice
Timo Stoll, Luis Müller, Christopher Morris

TL;DR
This paper introduces the Generalized-Distance Transformer, a versatile Graph Transformer architecture, and provides extensive empirical and theoretical insights into its design choices, demonstrating strong performance across diverse large-scale graph tasks.
Contribution
The paper presents the Generalized-Distance Transformer (GDT), an architecture that incorporates many recent advancements in Graph Transformers (GTs), and analyzes its expressivity and empirical performance across multiple domains.
Findings
Design choices that improve generalization across tasks
Strong few-shot transfer performance without fine-tuning
Empirical validation on over eight million graphs
Abstract
Graph Transformers (GTs) have shown strong empirical performance, yet current architectures vary widely in their use of attention mechanisms, positional embeddings (PEs), and expressivity. Existing expressivity results are often tied to specific design choices and lack comprehensive empirical validation on large-scale data. This leaves a gap between theory and practice, preventing generalizable insights that exceed particular application domains. Here, we propose the Generalized-Distance Transformer (GDT), a GT architecture using standard attention that incorporates many advancements for GTs from recent years, and develop a fine-grained understanding of the GDT's representation power in terms of attention and PEs. Through extensive experiments, we identify design choices that consistently perform well across various applications, tasks, and model scales, demonstrating strong performance…
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Multimodal Machine Learning Applications
