GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks
Taraneh Younesian, Daniel Daza, Emile van Krieken, Thiviyan Thanapalasingam, Peter Bloem

TL;DR
GRAPES introduces an adaptive, learnable sampling method for GNNs that improves scalability and accuracy across diverse graph types by dynamically identifying crucial nodes during training.
Contribution
The paper proposes GRAPES, an adaptive sampling technique that trains a second GNN to predict node sampling probabilities, improving GNN scalability and performance across various graph structures.
Findings
GRAPES outperforms static sampling methods in accuracy and scalability.
It maintains high accuracy with smaller sample sizes.
It is effective on both homophilous and heterophilous graphs.
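To make the homophilous/heterophilous distinction concrete, one common measure is the edge homophily ratio: the fraction of edges whose two endpoints share a label. The sketch below is illustrative only; the function name `edge_homophily` and the toy graph are assumptions for this example and do not come from the paper.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose endpoints carry the same label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy 4-cycle (hypothetical data): nodes 0 and 1 have label 0, nodes 2 and 3 have label 1.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
toy_labels = [0, 0, 1, 1]
toy_ratio = edge_homophily(toy_edges, toy_labels)
print(toy_ratio)
```

A ratio near 1 indicates a homophilous graph, while a ratio near 0 indicates a heterophilous one.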
Abstract
Graph neural networks (GNNs) learn to represent nodes by aggregating information from their neighbors. As GNNs increase in depth, their receptive field grows exponentially, leading to high memory costs. Several existing methods address this by sampling a small subset of nodes, scaling GNNs to much larger graphs. These methods are primarily evaluated on homophilous graphs, where neighboring nodes often share the same label. However, most of these methods rely on static heuristics that may not generalize across different graphs or tasks. We argue that the sampling method should be adaptive, adjusting to the complex structural properties of each graph. To this end, we introduce GRAPES, an adaptive sampling method that learns to identify the set of nodes crucial for training a GNN. GRAPES trains a second GNN to predict node sampling probabilities by optimizing the downstream task objective.…
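The idea of sampling neighbors according to per-node probabilities can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the scores are fixed placeholder numbers rather than the output of a trained second GNN, the helper names `one_hop_candidates` and `sample_by_score` are hypothetical, and weighted sampling without replacement (Efraimidis-Spirakis keys) is swapped in as a simple stand-in for whatever sampling procedure the paper actually uses.

```python
import random


def one_hop_candidates(adj_list, batch_nodes):
    """Neighbors of the batch that are not already part of the batch."""
    batch = set(batch_nodes)
    candidates = set()
    for v in batch_nodes:
        candidates.update(u for u in adj_list[v] if u not in batch)
    return sorted(candidates)


def sample_by_score(candidates, scores, k, rng):
    """Draw up to k candidates without replacement, favoring high scores.

    Uses Efraimidis-Spirakis keys: key = u ** (1 / score) with u uniform
    in [0, 1), then keeps the k largest keys. Scores must be positive.
    """
    if len(candidates) <= k:
        return list(candidates)
    keyed = [(rng.random() ** (1.0 / scores[u]), u) for u in candidates]
    keyed.sort(reverse=True)
    return sorted(u for _, u in keyed[:k])


# Toy graph as an adjacency list (hypothetical data).
adj_list = {0: [1, 2, 3], 1: [0, 4], 2: [0, 5], 3: [0], 4: [1, 5], 5: [2, 4]}
batch_nodes = [0, 1]
# Placeholder scores; in GRAPES these would be predicted by the second GNN.
scores = {0: 0.5, 1: 0.5, 2: 0.9, 3: 0.1, 4: 0.5, 5: 0.3}

rng = random.Random(0)
candidates = one_hop_candidates(adj_list, batch_nodes)
sampled = sample_by_score(candidates, scores, k=2, rng=rng)
print(candidates, sampled)
```

Capping each layer at `k` sampled nodes is what keeps the memory cost from growing with the full receptive field; the learned scores only decide which nodes fill that budget.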
Peer Reviews
Decision: Submitted to ICLR 2024
Compared with previous models such as AS-GCN, the sampling method of this model is more "adaptive". Although the model is much more complex than previous models, it does show some performance improvement.
1. The description of the algorithm is problematic. As far as I know, GFlowNet is a method proposed to sample for an energy-based model: it addresses a distribution approximation problem. It is a special case of an RL algorithm. In my view, the paper is a pure RL problem especially since the reward function is clearly defined. I think an RL formulation is straightforward from that. The formulation with GFlowNet is very misleading -- I spent hours before realizing that this is not a distribution
- The motivating ideas for the proposed sampling framework are very novel. They are a great example of integrating different concepts/techniques from modern deep learning to solve a relevant problem --- scalability of GNNs. - Although this is primarily an algorithm-based/application paper, the model, the algorithm, and the training mechanism are theoretically grounded, and the authors did a very good job at motivating and explaining the reasons behind their design choices. - The numerical result
- Some related work is missing, and perhaps also a comparison with other graph sampling baselines from the graph signal processing literature. Check, e.g., "Efficient Sampling Set Selection for Bandlimited Graph Signals Using Graph Spectral Proxies", by Anis and others, and papers therein (specifically, the works of Kovacevic and Moura; Chamon and Ribeiro; Segarra, Marques and Ribeiro; etc.). These papers are part of a subfield of graph signal processing---graph signal sampling---which studies h
1. GRAPES appears to improve F1 scores on most of the datasets compared to the other algorithms in the presented experimental setup. It has also been shown to consume much less memory than GAS, which uses a different, non-sampling strategy to address the scalability problem in large graphs, although GAS outperforms GRAPES on some datasets. 1. Different types of results, such as F1 scores, GPU memory allocation, robustness, and entropy, are provided to demonstrate the
1. **Experimental Setup**: In the presented setup, the proposed method outperforms the baselines. However, a few things about the setup are not clear: 1. It is not clear whether the baselines were tuned on a validation set. Why was the batch size fixed to 256 for the main results table? 1. A related concern is the low F1 scores compared to what is reported in other papers. Granted that this is in the transductive setting, I am not sure if that should cause such a decrease in perfor
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Materials Science · Graph Theory and Algorithms
