GraphCliff: Short-Long Range Gating for Subtle Differences but Critical Changes
Hajung Kim, Jueon Park, Junseok Choe, Sheunheun Baek, Hyeon Hwang, Jaewoo Kang

TL;DR
GraphCliff introduces a gating mechanism to integrate short- and long-range information in molecular graph neural networks, improving their ability to distinguish structurally similar molecules with different activities, especially activity cliffs.
Contribution
The paper proposes GraphCliff, a novel model that enhances graph neural networks with short-long range gating to better capture subtle but critical molecular differences.
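To make the idea of short-long range gating concrete, here is a minimal sketch of a generic sigmoid gate that blends a short-range and a long-range feature vector. The summary does not give GraphCliff's actual formulation, so the function name `gated_fusion`, the elementwise parameterization, and the weights `w_short`, `w_long`, and `bias` are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(h_short, h_long, w_short, w_long, bias):
    """Blend two feature vectors with an elementwise sigmoid gate.

    Hypothetical illustration: this gate parameterization is an assumption,
    not the formulation used in the paper.
    """
    fused, gates = [], []
    for hs, hl, ws, wl, b in zip(h_short, h_long, w_short, w_long, bias):
        g = sigmoid(ws * hs + wl * hl + b)     # gate value in (0, 1)
        gates.append(g)
        fused.append(g * hs + (1.0 - g) * hl)  # convex combination
    return fused, gates


# Toy vectors (made up). With zero weights and bias every gate is 0.5,
# so the fused vector is the plain average of the two inputs.
h_short = [1.0, -2.0]
h_long = [2.0, 0.5]
zeros = [0.0, 0.0]
fused, gates = gated_fusion(h_short, h_long, zeros, zeros, zeros)
print(fused)  # [1.5, -0.75]
```

A gate near 1 keeps the short-range signal and a gate near 0 keeps the long-range one, so a learned gate can in principle decide per feature how much local detail to preserve.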
Findings
GraphCliff outperforms baseline models on activity cliff datasets.
Layer-wise analysis shows reduced over-smoothing in GraphCliff.
GraphCliff shows enhanced discriminative power in the molecular embedding space.
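Over-smoothing is often quantified with Dirichlet energy, which one of the reviews lists among the paper's analyses. The sketch below uses a common unnormalized definition (the sum of squared feature differences across edges); the paper may use a different normalization. The helper `mean_aggregate` and the toy path graph are hypothetical and stand in for a single message-passing layer.

```python
def dirichlet_energy(features, edges):
    """Sum of squared feature differences over undirected edges.

    Low energy means neighboring nodes have nearly identical features,
    which is the signature of over-smoothing.
    """
    total = 0.0
    for i, j in edges:
        total += sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
    return total


def mean_aggregate(features, edges):
    """One round of mean aggregation with self-loops (assumed stand-in layer)."""
    n = len(features)
    neighbors = [{i} for i in range(n)]  # each node includes itself
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    dim = len(features[0])
    return [
        [sum(features[k][d] for k in neighbors[i]) / len(neighbors[i])
         for d in range(dim)]
        for i in range(n)
    ]


# Toy path graph 0 - 1 - 2 with made-up 2-dimensional node features.
edges = [(0, 1), (1, 2)]
features = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]

energy_before = dirichlet_energy(features, edges)
energy_after = dirichlet_energy(mean_aggregate(features, edges), edges)
print(energy_before, energy_after)  # the energy drops after aggregation
```

Tracking this quantity layer by layer is one way to check whether a model keeps node features distinguishable as depth grows.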
Abstract
Quantitative structure-activity relationship modeling assumes a smooth relationship between molecular structure and biological activity. However, activity cliffs, defined as pairs of structurally similar compounds with large potency differences, break this continuity. Recent benchmarks targeting activity cliffs have revealed that classical machine learning models with extended connectivity fingerprints outperform graph neural networks. Our analysis shows that graph embeddings fail to adequately separate structurally similar molecules in the embedding space, making it difficult to distinguish between structurally similar but functionally different molecules. Despite this limitation, molecular graph structures are inherently expressive and attractive, as they preserve molecular topology. To preserve the structural representation of molecules as graphs, we propose a new model, GraphCliff, which…
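The definition of an activity cliff in the abstract (structurally similar compounds with a large potency difference) can be sketched as a pairwise filter. This is an illustration only: the Tanimoto similarity on fingerprint bit sets, the thresholds `sim_threshold=0.9` and `potency_gap=1.0` (one log unit), and the toy data are assumptions, not values taken from the paper.

```python
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def find_cliff_pairs(fingerprints, potencies, sim_threshold=0.9, potency_gap=1.0):
    """Return index pairs that are structurally similar but differ in potency.

    Both thresholds are assumed defaults chosen for illustration.
    """
    pairs = []
    for i, j in combinations(range(len(fingerprints)), 2):
        similar = tanimoto(fingerprints[i], fingerprints[j]) >= sim_threshold
        far_apart = abs(potencies[i] - potencies[j]) >= potency_gap
        if similar and far_apart:
            pairs.append((i, j))
    return pairs


# Toy data (made up): molecules 0 and 1 share 10 of 11 fingerprint bits
# but differ by 2.2 log units; molecule 2 is structurally unrelated.
fingerprints = [set(range(10)), set(range(11)), set(range(20, 30))]
potencies = [5.0, 7.2, 9.0]  # e.g. potency on a log scale

cliff_pairs = find_cliff_pairs(fingerprints, potencies)
print(cliff_pairs)  # [(0, 1)]
```

In practice the bit sets would come from a fingerprinting tool such as extended connectivity fingerprints; plain Python sets are used here so the example stays self-contained.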
Peer Reviews
Decision: Submitted to ICLR 2026
1. (Originality) As noted in this paper, methods for modeling long-range dependencies have been studied in several fields, such as genome language models. Approaches integrating local and global interactions have been studied, including the Hyena Hierarchy. However, to my knowledge, this paper is the first to apply this idea to the long-range dependency problem in graph learning. 2. (Quality) The paper deals with the important and specific problem of the activity cliff. Furthermore, the numerica
1. The discussion of the numerical experiment results has room for improvement. Specifically, I question whether the presentation of Table 1 is appropriate. This table only highlights the datasets where the proposed method achieves the highest accuracy. Results for the remaining datasets are provided in the appendix. However, unless I have missed something, no validation or discussion of them is presented. 2. While the paper claims the proposed method achieves the best overall performan
1. The empirical results are strong and cover 30 benchmarks, although all of them stem from the same dataset. 2. The visualization obtained from the gating mechanism seems to provide interesting insights that align with domain priors. 3. The approach of combining high- and low-frequency signals with gating is intuitive and simple. 4. The authors provide ablation studies.
1. The approach is tailored to and demonstrated on molecules, and it is not clear whether other domains can benefit from it. 2. The empirical evaluation, although very extensive, focuses only on ChEMBL, and it remains unknown whether this method is also beneficial for other domains or datasets. Evaluating it on diverse benchmarks from other domains and tasks would make the merits of this work more convincing. 3. Based on the two comments above, it is possible the contribution is incremental as i
- The paper tackles an important and challenging domain-specific issue—*activity cliffs*—that conventional GNNs struggle with. - The proposed gating design effectively balances local and global information, reducing over-smoothing while preserving local sensitivity. - The experiments are thorough, including 30 datasets, multiple baselines, ablation and interpretability analyses (e.g., Hop-wise sensitivity, Dirichlet energy, …).
- Although the overall idea of integrating short- and long-range information is reasonable, the novelty of the approach is somewhat limited, as similar hybrid architectures (e.g., GROVER, GraphTrans) have already been proposed. The paper should more clearly articulate how GraphCliff’s gating design provides advantages specific to molecular *activity cliff* prediction. - Minor issues in figure and notation: - Figure 1 (Overall architecture of GraphCliff) is visually unrefined and lacks clear
Taxonomy
Topics: Advanced Graph Neural Networks · Computational Drug Discovery Methods · Bioinformatics and Genomic Networks
