Can You Hear Me Now? A Benchmark for Long-Range Graph Propagation
Luca Miglior, Matteo Tolloso, Alessio Gravina, Davide Bacciu

TL;DR
This paper introduces ECHO, a comprehensive benchmark for evaluating the ability of graph neural networks to capture long-range interactions, crucial for scientific applications, revealing significant performance gaps in current models.
Contribution
The paper presents ECHO, a novel benchmark with synthetic and real-world tasks designed to rigorously assess long-range propagation in GNNs, highlighting current limitations and guiding future improvements.
Findings
Popular GNNs show performance gaps on long-range tasks
Design choices can improve long-range information propagation
ECHO sets a new standard for evaluating GNNs in scientific applications
Abstract
Effectively capturing long-range interactions remains a fundamental yet unresolved challenge in graph neural network (GNN) research, critical for applications across diverse fields of science. To systematically address this, we introduce ECHO (Evaluating Communication over long HOps), a novel benchmark specifically designed to rigorously assess the capabilities of GNNs in handling very long-range graph propagation. ECHO includes three synthetic graph tasks, namely single-source shortest paths, node eccentricity, and graph diameter, each constructed over diverse and structurally challenging topologies intentionally designed to introduce significant information bottlenecks. ECHO also includes two real-world datasets, ECHO-Charge and ECHO-Energy, which define chemically grounded benchmarks for predicting atomic partial charges and molecular total energies, respectively, with reference…
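The three synthetic targets named in the abstract can be illustrated with plain breadth-first search. The following is a minimal sketch and not the benchmark's own generation code: it assumes an unweighted, undirected, connected graph stored as an adjacency dictionary, and the helper names (`bfs_distances`, `eccentricity`, `diameter`) and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first visit gives the shortest hop count
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricity(adj, node):
    """Largest hop distance from `node` to any other node (assumes connectivity)."""
    return max(bfs_distances(adj, node).values())


def diameter(adj):
    """Largest eccentricity over all nodes."""
    return max(eccentricity(adj, n) for n in adj)


# Toy graph (assumption for illustration): path 0-1-2-3-4 with leaf 5 on node 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}

sssp = bfs_distances(adj, 0)                 # node-level target: distance from a source
ecc = {n: eccentricity(adj, n) for n in adj}  # node-level target: eccentricity
diam = diameter(adj)                          # graph-level target: diameter
```

In this toy graph the endpoints of the path have eccentricity equal to the diameter, so a model predicting either value at node 0 needs information from node 4, four hops away. This is the kind of dependency that grows with graph size in long-range tasks.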
Peer Reviews
Decision·ICLR 2026 Poster
- There is a gap in datasets for long-range testing of GNNs, and although the benchmarks that this paper discusses exist, they have limitations; the proposed datasets address this. - The datasets include both synthetic and real-world graphs, with the real ones standing out as a good contribution due to their curation as well as the performance trends. - Empirical results show that models arguably good at long-range propagation, such as GPS, A-DGN, SWAN, DRew, etc., outperform MPNNs, with particularly
- The primary contribution of this work is the collection of molecular datasets, as similar synthetic datasets are known in the prior literature. - The strong claim in lines 146-147 about the molecular dataset being long-range could be disputed, since prior molecular datasets also involve long-range effects, although they could be synthetic targets. - The paper lacks close discussion/comparison/adaptation of known insights with 2 works which would make it stronger, particularly on quantifying ECHO dat
- The need for new long-range benchmarks and the shortcomings of existing ones are well-argued. - The tasks, both synthetic and real-world, are at least as well motivated as existing benchmarks; I can see ECHO becoming a new and sorely-needed standard benchmark for the community. - The paper is very well written, and Figures 1,2 and 3 are used to great effect to illustrate the tasks introduced; I think this will be a great boon to adoption. - The authors evidently put a lot of time and effort in
- [W1] The authors argue that the -Charge and -Energy tasks are 'inherently long-range' due to (i) their large size/diameter, 17-40 hops, and (ii) the underlying task; they say: - "The three-dimensional configuration of molecules greatly intensifies this task complexity, as distant atoms in the graph topology can still exert significant influence on electronic properties and total energy." - [W1.1] This makes sense to me, but do you have a source for it? - To put it another way; large diame
1. The paper is very well written and very clear. 2. The paper addresses an open problem, namely finding a good long-range benchmark for GNNs, which is particularly relevant given the recent criticisms of LRGB (hyperparameter tuning, and the long-range nature of the task). 3. The hyperparameter tuning is more rigorous than in previous work, directly addressing a criticism of previous benchmarks. 4. The ECHO-Charge and ECHO-Synth tasks are original and may foster more work applying GNNs to chemistry appl
1. **Long-range evidence for real-world tasks is lacking** The evidence for the long-range nature of the real-world benchmarks (ECHO-Charge and ECHO-Energy) is limited. Indeed, there seems to be no correlation between depth and performance (Fig 7d). The only argument is that long-range architectures (such as GPS and SWAN) outperform standard architectures that are supposedly not long-range. This argument is very similar to the LRGB arguments, which were shown to be limited. On the other hand, the hyperparam
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Materials Science · Complex Network Analysis Techniques
