Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs

Batu El; Deepro Choudhury; Pietro Liò; Chaitanya K. Joshi

arXiv:2502.12352 · cs.LG · February 26, 2025 · 3 cites

TL;DR

This paper introduces Attention Graphs, a tool for understanding how Graph Transformers process information by analyzing attention patterns, revealing that learned attention structures often differ from original graphs and vary across models.

Contribution

The paper presents Attention Graphs as an interpretability method that links message passing in GNNs to self-attention in Transformers, and analyzes them on homophilous and heterophilous node classification tasks.

Findings

01

Attention Graphs reveal diverse information flow patterns.

02

Learned attention structures often do not match original graph structures.

03

Different Transformer variants can achieve similar performance with distinct attention patterns.
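One simple way to check the second finding is to binarize an attention matrix and measure how many edges it shares with the input graph. The sketch below is illustrative only: the function name `edge_overlap`, the fixed `threshold`, and the choice of Jaccard similarity are assumptions made here, not the metric used in the paper.

```python
import numpy as np

def edge_overlap(attention, adjacency, threshold=0.1):
    """Jaccard similarity between thresholded attention edges and input-graph edges.

    attention: (n, n) array of attention weights (hypothetical input).
    adjacency: (n, n) binary adjacency matrix of the original graph.
    threshold: assumed cutoff above which an attention weight counts as an edge.
    """
    attention = np.asarray(attention, dtype=float)
    adjacency = np.asarray(adjacency).astype(bool)

    # Ignore self-loops so only node-to-node edges are compared.
    off_diag = ~np.eye(adjacency.shape[0], dtype=bool)
    learned = (attention > threshold) & off_diag
    original = adjacency & off_diag

    union = np.logical_or(learned, original).sum()
    if union == 0:
        return 1.0  # both graphs are empty, so they trivially agree
    return float(np.logical_and(learned, original).sum() / union)
```

A value near 1 means the thresholded attention edges largely coincide with the input edges; a value near 0 means the model attends along different pairs of nodes.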

Abstract

We introduce Attention Graphs, a new tool for mechanistic interpretability of Graph Neural Networks (GNNs) and Graph Transformers based on the mathematical equivalence between message passing in GNNs and the self-attention mechanism in Transformers. Attention Graphs aggregate attention matrices across Transformer layers and heads to describe how information flows among input nodes. Through experiments on homophilous and heterophilous node classification tasks, we analyze Attention Graphs from a network science perspective and find that: (1) When Graph Transformers are allowed to learn the optimal graph structure using all-to-all attention among input nodes, the Attention Graphs learned by the model do not tend to correlate with the input/original graph structure; and (2) For heterophilous graphs, different Graph Transformer variants can achieve similar performance while utilising…
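The aggregation step described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions: heads are averaged within each layer and the per-layer matrices are multiplied across layers. The function name `aggregate_attention` and the input layout `(num_layers, num_heads, n, n)` are hypothetical and chosen here for illustration; the abstract does not specify the exact aggregation rule.

```python
import numpy as np

def aggregate_attention(attn):
    """Collapse per-layer, per-head attention into one (n, n) matrix.

    attn: array of shape (num_layers, num_heads, n, n), where each
          (n, n) slice is row-stochastic (rows sum to 1), as produced
          by a softmax over keys.
    """
    attn = np.asarray(attn, dtype=float)

    # Assumption: heads within a layer are combined by a plain average.
    per_layer = attn.mean(axis=1)  # shape (num_layers, n, n)

    # Assumption: layers are composed by matrix multiplication, so
    # entry (i, j) estimates how much node j's input reaches node i
    # after all layers.
    aggregated = np.eye(attn.shape[-1])
    for layer_matrix in per_layer:
        aggregated = layer_matrix @ aggregated
    return aggregated
```

Because averages and products of row-stochastic matrices are again row-stochastic, each row of the result still sums to 1 and can be read as a distribution over source nodes.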

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

batu-el/understanding-inductive-biases-of-gnns
Official

Taxonomy

Topics: Advanced Graph Neural Networks · Biomedical Text Mining and Ontologies

Methods: Attention Is All You Need · Absolute Position Encodings · Dense Connections · Linear Layer · Layer Normalization · Byte Pair Encoding · Residual Connection · Laplacian EigenMap · Label Smoothing · Multi-Head Attention