Model Metamers Reveal Invariances in Graph Neural Networks
Wei Xu, Xiaoyi Jiang, Lixiang Xu, Dechao Tang

TL;DR
This paper introduces a method for generating graph metamers for GNNs. The metamers reveal that GNNs are highly invariant, highlight the gap between GNN invariance and human perception, and serve as a new benchmark for evaluating GNN invariance.
Contribution
The paper presents a novel graph metamer generation technique to analyze invariance in GNNs and provides theoretical and empirical insights into their limitations.
Findings
GNNs exhibit extreme invariance to structural changes.
Targeted training strategies only partially reduce invariance.
Metamer graphs serve as a new benchmark for GNN evaluation.
Abstract
In recent years, deep neural networks have been extensively employed in perceptual systems to learn representations endowed with invariances, aiming to emulate the invariance mechanisms observed in the human brain. However, studies in the visual and auditory domains have confirmed that significant gaps remain between the invariance properties of artificial neural networks and those of humans. To investigate the invariance behavior of graph neural networks (GNNs), we introduce a technique for generating model "metamers". By optimizing input graphs such that their internal node activations match those of a reference graph, we obtain graphs that are equivalent in the model's representation space, yet differ significantly in both structure and node features. Our theoretical analysis focuses on two aspects: the local metamer dimension for a single node and the activation-induced volume…
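The optimization described in the abstract (adjusting an input until its internal activations match those of a reference graph) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a single GCN-style layer with random weights and a tanh nonlinearity, keeps the graph structure fixed, optimizes only continuous node features, and uses a hand-derived gradient. All names (`normalized_adjacency`, `generate_feature_metamer`, the toy ring graph) are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalized adjacency with self-loops (GCN-style)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def activations(A_norm, X, W):
    """One assumed GCN-style layer: tanh(A_norm @ X @ W)."""
    return np.tanh(A_norm @ X @ W)

def generate_feature_metamer(A_norm, X_ref, W, steps=2000, lr=0.02, seed=0):
    """Gradient descent on node features so activations match the reference."""
    rng = np.random.default_rng(seed)
    H_ref = activations(A_norm, X_ref, W)
    X = rng.normal(size=X_ref.shape)  # random start, unrelated to X_ref
    losses = []
    for _ in range(steps):
        H = activations(A_norm, X, W)
        diff = H - H_ref
        losses.append(float(np.sum(diff ** 2)))
        # Chain rule: d loss / d Z = 2 * diff * tanh'(Z), with tanh'(Z) = 1 - H^2
        grad_Z = 2.0 * diff * (1.0 - H ** 2)
        grad_X = A_norm.T @ grad_Z @ W.T
        X = X - lr * grad_X
    final = activations(A_norm, X, W) - H_ref
    losses.append(float(np.sum(final ** 2)))
    return X, losses

# Toy setup (all values are illustrative assumptions): a 6-node ring graph,
# 8 input features, 3 hidden units, random layer weights.
rng = np.random.default_rng(42)
n, d, h = 6, 8, 3
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = 1.0
    A[(i + 1) % n, i] = 1.0
A_norm = normalized_adjacency(A)
X_ref = rng.normal(size=(n, d))
W = rng.normal(size=(d, h)) / np.sqrt(d)

X_meta, losses = generate_feature_metamer(A_norm, X_ref, W)
print("activation mismatch: start", losses[0], "end", losses[-1])
print("feature distance to reference:", np.linalg.norm(X_meta - X_ref))
```

Because the hidden width (3) is smaller than the feature width (8) in this toy setup, the gradient never changes the component of the features that the weight matrix ignores, so the optimized features can match the reference activations closely while staying far from the reference features. Structure metamers, which require optimizing over discrete edges, are not covered by this sketch.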
Peer Reviews
Decision · Submitted to ICLR 2026
- The topic is original and timely, relating to human alignment and connecting ideas from neuroscience and machine learning.
- The proposed framework is interesting and novel, though the treatment of structure metamers is less developed than that of feature metamers.
- The manifold formalization in eqn. (3) is conceptually strong.
- Feature metamer generation is feasible in practice since the learnable parameters used in their generation do not depend on graph size.
- The connection between inv

- Structure metamers are treated superficially. Their generation scales poorly for large graphs, and the paper does not address this.
- Only one feature distance (cosine similarity) and one graph distance (Weisfeiler-Lehman kernel) are considered. Both choices limit the conclusions. Cosine similarity is distribution-aware but removes differences in scale, which can be important (e.g., in Cora, it removes the distinction between very frequent and infrequent words). Another issue is using only the
1. (Originality) Consistency scores are defined as a novel method to evaluate GNN invariance.
2. (Quality) The numerical experiments cover a wide range of datasets and models. Ablation studies are conducted to assess the sensitivity of the score to hyperparameters.
3. (Clarity) The writing is clear. The paper's structure is appropriate, and I had no difficulty understanding the individual explanations.
4. (Significance) By using a straight-through estimator, the method can handle binary features

1. The problem this paper aims to solve seems unclear to me. While the purpose of investigating GNN invariance is stated, if I do not miss any information, the paper does not explicitly explain why this investigation is important and what specific problems it solves. Although the issue of the gap between GNN and human perception is suggested in the introduction, it remains unaddressed throughout the paper.
2. The motivation for the evaluation metric is also unclear. The numerical experiments use
The paper is well-written, and it is easy to follow. There are numerous experiments (multiple GNN architectures and benchmarks) and ablation studies, including different architectural changes and hidden dimensions. The issue of representational invariance in GNNs is a significant research topic.
The GNNs used are older architectures, and the datasets are small. It is unclear whether the conclusions derived from the experiments hold for modern GNNs and larger graph datasets. The recipes to mitigate model invariance are too few. Experiments on structural metamers are too few and inconclusive. The performance drop in the cross-architecture metamers experiment seems marginal in most cases, yet it is described as "substantial", "disruptive", and "showing a dramatic drop in accuracy".
Taxonomy
Topics: Advanced Graph Neural Networks · Face Recognition and Perception · Functional Brain Connectivity Studies
