Distinguished In Uniform: Self Attention Vs. Virtual Nodes

Eran Rosenbluth, Jan Tönshoff, Martin Ritzert, Berke Kisin, Martin Grohe

arXiv:2405.11951·cs.LG·May 21, 2024

TL;DR

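This paper compares the expressivity of Graph Transformers (GTs), which use global self-attention, with that of MPGNNs augmented with a Virtual Node. It shows that neither is a uniform-universal approximator, that is, neither can approximate every target function with a single network across graphs of all sizes, and it reports experiments on synthetic and real-world data.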

Contribution

The paper gives a theoretical comparison of the uniform expressivity of GTs and of MPGNNs with Virtual Nodes, showing that neither model's expressivity subsumes the other's.

Findings

01 Neither GTs nor MPGNNs with Virtual Nodes are uniform-universal approximators.

02 Neither model's uniform expressivity subsumes the other's.

03 Experiments show mixed results in practice, with neither architecture consistently ahead.

Abstract

Graph Transformers (GTs) such as SAN and GPS are graph processing models that combine Message-Passing GNNs (MPGNNs) with global Self-Attention. They were shown to be universal function approximators, with two reservations: 1. The initial node features must be augmented with certain positional encodings. 2. The approximation is non-uniform: Graphs of different sizes may require a different approximating network. We first clarify that this form of universality is not unique to GTs: Using the same positional encodings, also pure MPGNNs and even 2-layer MLPs are non-uniform universal approximators. We then consider uniform expressivity: The target function is to be approximated by a single network for graphs of all sizes. There, we compare GTs to the more efficient MPGNN + Virtual Node architecture. The essential difference between the two model definitions is in their global computation…
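
The difference in global computation that the abstract refers to can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, the Virtual Node is assumed to use plain sum aggregation, and the attention step is a single head with identity query/key/value projections. It only shows that one readout is an unnormalized sum while the other is a softmax-weighted average.

```python
import math


def virtual_node_sum(features):
    """Virtual-Node-style global step (assumed sum aggregation).

    Every node would receive this same vector in the next layer.
    """
    dim = len(features[0])
    return [sum(f[d] for f in features) for d in range(dim)]


def self_attention_readout(features, query_index):
    """Single-head dot-product attention for one query node.

    Assumes identity query/key/value projections for brevity.
    """
    q = features[query_index]
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # non-negative, sums to 1
    dim = len(q)
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


# Toy node features for a 3-node graph, and the same graph duplicated.
small = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
large = small + small

vn_small = virtual_node_sum(small)
vn_large = virtual_node_sum(large)
att_small = self_attention_readout(small, 0)
att_large = self_attention_readout(large, 0)
```

In this toy setting the sum doubles when the graph is duplicated, whereas the attention output is unchanged because the softmax weights always form an average. This is only an intuition for why the two global mechanisms can behave differently as graph size varies; it is not a statement of the paper's results.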

Code & Models

Repositories

toenshoff/vn-vs-gt (PyTorch, official)

Taxonomy

Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Bayesian Modeling and Causal Inference

Methods: Goal-Driven Tree-Structured Neural Model · Greedy Policy Search