TL;DR
This paper enhances transformer architectures aligned with the Weisfeiler-Leman hierarchy, improving their expressivity and practicality for graph tasks, and demonstrates competitive performance on large-scale and molecular datasets.
Contribution
It advances the alignment of transformers with the $k$-WL hierarchy, providing stronger theoretical expressivity results and practical feasibility, along with a framework for studying positional encodings.
Findings
Stronger expressivity results for transformers aligned with $k$-WL.
Competitive performance on the large-scale PCQM4Mv2 dataset.
Effective fine-tuning on small molecular datasets.
Abstract
Graph neural network architectures aligned with the $k$-dimensional Weisfeiler-Leman ($k$-WL) hierarchy offer theoretically well-understood expressive power. However, these architectures often fail to deliver state-of-the-art predictive performance on real-world graphs, limiting their practical utility. While recent works aligning graph transformer architectures with the $k$-WL hierarchy have shown promising empirical results, employing transformers for higher orders of $k$ remains challenging due to the prohibitive runtime and memory complexity of self-attention as well as impractical architectural assumptions, such as an infeasible number of attention heads. Here, we advance the alignment of transformers with the $k$-WL hierarchy, showing stronger expressivity results for each $k$, making them more feasible in practice. In addition, we develop a theoretical framework that allows the…
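To make the complexity concern in the abstract concrete, here is a minimal sketch that assumes the simplest possible tokenization: one token per ordered $k$-tuple of nodes, with dense self-attention over all tokens. The function names are hypothetical and the paper's actual architecture may tokenize differently. The sketch only illustrates why the token count grows as $n^k$ and a dense attention matrix as $n^{2k}$.

```python
from itertools import product


def k_tuple_tokens(num_nodes: int, k: int) -> int:
    """Number of tokens if every ordered k-tuple of nodes is one token."""
    return num_nodes ** k


def attention_pairs(num_nodes: int, k: int) -> int:
    """Entries in a single dense attention matrix over all k-tuple tokens."""
    tokens = k_tuple_tokens(num_nodes, k)
    return tokens * tokens


def enumerate_k_tuples(num_nodes: int, k: int) -> list:
    """Explicitly list all ordered k-tuples (only sensible for tiny graphs)."""
    return list(product(range(num_nodes), repeat=k))


if __name__ == "__main__":
    n = 20  # illustrative graph size, not taken from the paper
    for k in (1, 2, 3):
        print(f"k={k}: tokens={k_tuple_tokens(n, k)}, "
              f"attention entries={attention_pairs(n, k)}")
```

Under this assumption, even a 20-node graph at $k = 3$ requires 8,000 tokens and 64 million attention entries per head and layer, which is the kind of blow-up the abstract describes as prohibitive.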
Taxonomy
Methods: Softmax · Laplacian EigenMap · Layer Normalization · Laplacian Positional Encodings · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need
