Massive Activations in Graph Neural Networks: Decoding Attention for Domain-Dependent Interpretability

Lorenzo Bini; Marco Sorbi; Stephane Marchand-Maillet

arXiv:2409.03463 · cs.LG · March 10, 2025

TL;DR

This paper uncovers Massive Activations in attention-based GNNs, showing they encode domain-specific signals and can be used for interpretability, especially in molecular graphs.

Contribution

It introduces a novel method to detect Massive Activations in edge-featured GNNs and links these activations to domain-relevant information, enhancing interpretability.

Findings

01 Massive Activations correlate with common bond types in molecules.

02 MAs can serve as attribution indicators for less informative edges.

03 The method is validated on benchmark datasets such as ZINC, TOX21, and PROTEINS.

Abstract

Graph Neural Networks (GNNs) have become increasingly popular for effectively modeling graph-structured data, and attention mechanisms have been pivotal in enabling these models to capture complex patterns. In our study, we reveal a critical yet underexplored consequence of integrating attention into edge-featured GNNs: the emergence of Massive Activations (MAs) within attention layers. By developing a novel method for detecting MAs on edge features, we show that these extreme activations are not only activation anomalies but encode domain-relevant signals. Our post-hoc interpretability analysis demonstrates that, in molecular graphs, MAs aggregate predominantly on common bond types (e.g., single and double bonds) while sparing more informative ones (e.g., triple bonds). Furthermore, our ablation studies confirm that MAs can serve as natural attribution indicators, reallocating to less…
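The abstract describes detecting extreme activations on edge features but the excerpt does not give the detection rule. As a rough illustration only, the sketch below flags values whose magnitude is both large in absolute terms and large relative to the median magnitude. The function name `flag_massive_activations`, the ratio-to-median criterion, and the thresholds `ratio` and `abs_min` are assumptions made for this example, not the paper's method or values.

```python
from statistics import median


def flag_massive_activations(values, ratio=1000.0, abs_min=100.0):
    """Flag activations that look 'massive' under a hypothetical criterion.

    A value is flagged when its magnitude exceeds `abs_min` AND is more than
    `ratio` times the median magnitude. Both thresholds are illustrative
    assumptions, not taken from the paper.
    """
    mags = [abs(v) for v in values]
    med = median(mags)
    # If the median is zero, fall back to the absolute threshold alone.
    return [m > abs_min and (med == 0 or m / med > ratio) for m in mags]


# Toy edge-feature activations: one value dwarfs the rest.
toy_activations = [0.1, -0.2, 0.15, 0.05, 250.0, -0.1]
mask = flag_massive_activations(toy_activations)
flagged_indices = [i for i, is_massive in enumerate(mask) if is_massive]
```

On this toy input only the entry at index 4 is flagged. A real analysis would apply such a rule to the attention-layer activations of each edge, and the flagged edges could then be grouped by bond type, as in the post-hoc analysis the abstract describes.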

Code & Models

Repositories

msorbi/gnn-ma · PyTorch · Official

Taxonomy

Topics: Neural Networks and Applications

Methods: Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Laplacian EigenMap · Softmax · Label Smoothing · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer