P-DROP: Poisson-Based Dropout for Graph Neural Networks
Hyunsik Yun

TL;DR
This paper introduces P-DROP, a Poisson process-based node selection method for GNNs that mitigates over-smoothing by enabling stochastic, structure-aware updates, improving training effectiveness and accuracy.
Contribution
It proposes a novel Poisson-based node update strategy for GNNs, offering a new regularization and training scheme that enhances model performance and structural diversity.
Findings
Achieves competitive or better accuracy on benchmark datasets
Reduces over-smoothing in GNN training
Effective as both regularization and dynamic subgraph training
Abstract
Over-smoothing remains a major challenge in Graph Neural Networks (GNNs), where repeated message passing causes node representations to converge and lose discriminative power. To address this, we propose a novel node selection strategy based on Poisson processes, introducing stochastic but structure-aware updates. Specifically, we equip each node with an independent Poisson clock, enabling asynchronous and localized updates that preserve structural diversity. We explore two applications of this strategy: as a replacement for dropout-based regularization and as a dynamic subgraph training scheme. Experimental results on standard benchmarks (Cora, Citeseer, Pubmed) demonstrate that our Poisson-based method yields competitive or improved accuracy compared to traditional Dropout, DropEdge, and DropNode approaches, particularly in later training stages.
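The per-node Poisson clock described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names, the `base_rate`, `alpha`, and `dt` parameters, and the degree-based rate λ_i = base_rate · (deg_i + 1)^alpha are all assumptions made for the example, since this summary does not specify how the rates are chosen. The one fact relied on is a standard property of Poisson processes: a clock with rate λ_i ticks at least once in a window of length dt with probability 1 − exp(−λ_i · dt).

```python
import numpy as np


def poisson_clock_mask(degrees, base_rate=1.0, alpha=0.0, dt=1.0, rng=None):
    """Sample which nodes are active in one training step.

    Each node i gets an independent Poisson clock with rate lambda_i.
    A node counts as active if its clock ticks at least once within dt.
    The degree-aware rate used here is a hypothetical parametrization.
    """
    rng = np.random.default_rng() if rng is None else rng
    degrees = np.asarray(degrees, dtype=float)
    # Assumed rate: alpha > 0 favors high-degree nodes, alpha < 0 low-degree ones.
    rates = base_rate * np.power(degrees + 1.0, alpha)
    # The number of ticks in a window of length dt is Poisson(lambda_i * dt).
    ticks = rng.poisson(rates * dt)
    return ticks > 0, rates


def activation_probability(rates, dt=1.0):
    """Closed-form probability that a clock with the given rate ticks within dt."""
    return 1.0 - np.exp(-np.asarray(rates, dtype=float) * dt)


def masked_update(h_old, h_new, mask):
    """Keep the new representation for active nodes and the old one otherwise."""
    return np.where(mask[:, None], h_new, h_old)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    degrees = np.array([1, 2, 5, 10])
    mask, rates = poisson_clock_mask(degrees, alpha=-0.5, rng=rng)
    h_old = np.zeros((4, 3))
    h_new = np.ones((4, 3))
    print("rates:", rates)
    print("activation probabilities:", activation_probability(rates))
    print("updated representations:\n", masked_update(h_old, h_new, mask))
```

Written this way, each step reduces to an independent Bernoulli draw per node with a node-specific probability, which makes it easy to compare the sketch against uniform node dropping by setting `alpha=0`.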
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
- Proposes a novel, Poisson-clock–based node-selection mechanism for GNN dropout.
- Presents a more principled alternative to uniform, heuristic drop schemes by grounding stochasticity in probability theory.
- The asynchronous, sparse activation design is a promising direction for alleviating over-smoothing in deeper GNNs.
The core idea is clear, but the writing and experimental evaluation need substantial improvement:
- Underspecified math: Key parameters (λ, α) are neither clearly defined nor justified.
- Figure 1 is included but never referenced in the text.
- Naming inconsistency: The method is referred to as both “P-DROP” and “SGNN,” causing confusion; standardize terminology.
- Core claim untested: Over-smoothing mitigation is asserted but not evaluated on deep GNNs; include 8–32-layer studies.
- Narrow benchmarks & wea
The proposed idea of asynchronously updating nodes is interesting.
- The paper is haphazardly presented with countless grammatical errors.
- There is limited technical innovation in the proposed method.
- The mathematical reasoning is very informal.
- The performance of the proposed method is not better.
- Only outdated, small benchmark datasets are used.
1. It fills the gap of poor effectiveness and performance of dropout methods in GNN training.
2. The proposed node selection method fully considers the structural features of the graph, rather than relying solely on uniform sampling as in previous work.
3. Experimental results validate the effectiveness of the proposed method.
1. The paper lacks an in-depth analysis of the proposed method and only presents the conclusion and results.
2. Some of the experimental results in Table 1 are not optimal, and the improvements are extremely limited.
3. The experimental section lacks certain ablation experiments, such as the impact of each node’s λ value on performance and how to determine the λ value for each node.
4. The parameters of the baseline systems in the experiments are not mentioned, such as the dropout parameter in T
- The general idea of modeling node activations with an underlying stochastic process is sensible and interesting.
- The degree-aware parametrization is interesting and merits further investigation.
- Novelty is limited: The Poisson clock is equivalent to per-node Bernoulli dropout with node-specific rates. Hence, P-DROP is close to non-uniform DropNode.
- In all reported performance benchmarks, error estimates are missing and no advantage can be deduced (although apparently 10 runs were performed).
- Two degree-aware variants are discussed to either enhance or mitigate the influence of the graph structure on the node activation, but it is not clear which is used, and what respecti
Taxonomy
Topics: Advanced Graph Neural Networks · Advanced Neural Network Applications · Graph Theory and Algorithms
