Not All Neighbors Matter: Understanding the Impact of Graph Sparsification on GNN Pipelines
Yuhang Song, Naima Abrar Shami, Romaric Duvignau, Vasiliki Kalavri

TL;DR
This paper investigates graph sparsification as a pre-processing step for GNNs, showing that it can speed up training and inference and improve scalability while maintaining accuracy, especially on large graphs.
Contribution
The paper provides an extensible framework for systematically evaluating sparsification methods, and the first comprehensive study of how graph sparsification affects GNN training and inference.
Findings
Sparsification often preserves or improves accuracy.
Benefits increase with graph scale, accelerating training and inference.
Sparsification overhead is quickly amortized, enabling practical large-scale use.
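The amortization finding can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the function name `break_even_epochs` and every timing in it are hypothetical and are not taken from the paper. It computes how many training epochs are needed before a one-off sparsification cost is repaid by the time saved per epoch.

```python
import math


def break_even_epochs(sparsify_seconds, epoch_seconds_full, epoch_seconds_sparse):
    """Return the smallest number of epochs after which training on the
    sparsified graph (including the one-off sparsification cost) takes no
    longer than training on the full graph.

    Returns None when the sparsified graph gives no per-epoch saving,
    in which case the overhead is never repaid.
    """
    saving_per_epoch = epoch_seconds_full - epoch_seconds_sparse
    if saving_per_epoch <= 0:
        return None
    return math.ceil(sparsify_seconds / saving_per_epoch)


# Hypothetical timings, chosen only to illustrate the arithmetic:
# 120 s of pre-processing, 30 s per epoch before, 18 s per epoch after.
print(break_even_epochs(120.0, 30.0, 18.0))  # 120 / (30 - 18) = 10 epochs
```

Under these assumed numbers the overhead is repaid after 10 epochs; a larger per-epoch saving shortens the break-even point proportionally.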
Abstract
As graphs scale to billions of nodes and edges, graph Machine Learning workloads are constrained by the cost of multi-hop traversals over exponentially growing neighborhoods. While various system-level and algorithmic optimizations have been proposed to accelerate Graph Neural Network (GNN) pipelines, data management and movement remain the primary bottlenecks at scale. In this paper, we explore whether graph sparsification, a well-established technique that reduces edges to create sparser neighborhoods, can serve as a lightweight pre-processing step to address these bottlenecks while preserving accuracy on node classification tasks. We develop an extensible experimental framework that enables systematic evaluation of how different sparsification methods affect the performance and accuracy of GNN models. We conduct the first comprehensive study of GNN training and inference on…
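The abstract's core idea, removing edges so that multi-hop neighborhoods shrink, can be sketched in a few lines. The example below is a minimal illustration and not the paper's framework: it uses uniform random edge sampling as a stand-in sparsifier (the excerpt does not say which methods the paper evaluates), and the helper names `random_edge_sparsify` and `k_hop_size` are hypothetical.

```python
import random


def random_edge_sparsify(edges, keep_prob, seed=0):
    """Keep each edge independently with probability keep_prob.

    Assumption: uniform random sampling is used here only as the simplest
    possible sparsifier; it is not claimed to be one of the paper's methods.
    """
    if not 0.0 <= keep_prob <= 1.0:
        raise ValueError("keep_prob must be in [0, 1]")
    rng = random.Random(seed)
    return [edge for edge in edges if rng.random() < keep_prob]


def k_hop_size(edges, source, k):
    """Count nodes reachable from source within k hops in an undirected
    graph, excluding the source itself."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen = {source}
    frontier = {source}
    for _ in range(k):
        # Expand one hop and drop nodes that were already visited.
        frontier = {n for u in frontier for n in adjacency.get(u, ())} - seen
        seen |= frontier
    return len(seen) - 1


# Toy graph: a 20-node ring with extra chords, so neighborhoods grow quickly.
ring = [(i, (i + 1) % 20) for i in range(20)]
chords = [(i, (i + 5) % 20) for i in range(20)]
full_edges = ring + chords

sparse_edges = random_edge_sparsify(full_edges, keep_prob=0.5, seed=42)
print(len(full_edges), len(sparse_edges))
print(k_hop_size(full_edges, 0, 2), k_hop_size(sparse_edges, 0, 2))
```

Because the sparsified edge list is a subset of the original, the k-hop neighborhood of any node can only stay the same size or shrink, which is the mechanism by which sparsification reduces the data a GNN has to gather per node.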
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Machine Learning in Healthcare
