How Attentive are Graph Attention Networks?
Shaked Brody, Uri Alon, Eran Yahav

TL;DR
This paper critically examines Graph Attention Networks (GATs), revealing their limited static attention mechanism and introducing GATv2, a more expressive dynamic attention variant that outperforms GAT on multiple benchmarks.
Contribution
The paper identifies the limitations of GAT's static attention and proposes GATv2, a simple modification that enhances expressiveness and improves performance across various graph learning benchmarks.
Findings
In GAT, the ranking of the attention scores is unconditioned on the query node, so every query ranks its keys in the same order.
GATv2 outperforms GAT across 11 benchmarks, drawn from OGB and other sources.
GATv2 matches GAT's parametric costs while being more expressive.
Abstract
Graph Attention Networks (GATs) are one of the most popular GNN architectures and are considered as the state-of-the-art architecture for representation learning with graphs. In GAT, every node attends to its neighbors given its own representation as the query. However, in this paper we show that GAT computes a very limited kind of attention: the ranking of the attention scores is unconditioned on the query node. We formally define this restricted kind of attention as static attention and distinguish it from a strictly more expressive dynamic attention. Because GATs use a static attention mechanism, there are simple graph problems that GAT cannot express: in a controlled problem, we show that static attention hinders GAT from even fitting the training data. To remove this limitation, we introduce a simple fix by modifying the order of operations and propose GATv2: a dynamic graph…
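The static/dynamic distinction in the abstract can be illustrated with a small NumPy sketch. It assumes the scoring functions as they are usually written for the two models: GAT scores a pair as LeakyReLU(a^T [W h_i || W h_j]), while GATv2 applies the nonlinearity before the attention vector, a^T LeakyReLU(W [h_i || h_j]). The one-hot node features, the hand-picked weights, the fully connected toy graph, and the function names are all assumptions chosen for illustration; they are not taken from the paper's experiments.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # Strictly increasing, so it never changes the order of its inputs.
    return np.where(x > 0, x, slope * x)

def gat_scores(H, W, a_l, a_r):
    """GAT-style score: LeakyReLU(a^T [W h_i || W h_j]) with a = [a_l || a_r].

    Entry [i, j] is the score of key j for query i.
    """
    Z = H @ W.T                      # transformed nodes, shape (n, d')
    query_term = Z @ a_l             # depends only on the query i
    key_term = Z @ a_r               # depends only on the key j
    return leaky_relu(query_term[:, None] + key_term[None, :])

def gatv2_scores(H, W_l, W_r, a):
    """GATv2-style score: a^T LeakyReLU(W [h_i || h_j]) with W = [W_l || W_r]."""
    Z = (H @ W_l.T)[:, None, :] + (H @ W_r.T)[None, :, :]   # shape (n, n, d')
    return leaky_relu(Z) @ a                                  # shape (n, n)

# Toy setup (hypothetical): three nodes with one-hot features,
# and every node attends to every node.
H = np.eye(3)

# GAT with arbitrary hand-picked weights.
gat = gat_scores(H, np.eye(3),
                 a_l=np.array([0.5, -1.0, 0.0]),
                 a_r=np.array([2.0, -1.0, 0.5]))

# GATv2 with weights chosen so that each query prefers the key equal to itself.
gatv2 = gatv2_scores(H, W_l=np.eye(3), W_r=-np.eye(3), a=-np.ones(3))

print("GAT   top key per query:", gat.argmax(axis=1))    # [0 0 0]
print("GATv2 top key per query:", gatv2.argmax(axis=1))  # [0 1 2]
```

In the GAT score the query contributes only an additive constant inside a monotonic function, so each row of `gat` ranks the keys identically, whatever weights are chosen. Softmax is also monotonic, so the same holds for the normalized attention weights. In the GATv2 score the nonlinearity sits between W and a, which lets the preferred key depend on the query, as the hand-built weights above show.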
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Advanced Neural Network Applications
Methods: Graph Attention Network v2
