GraNNite: Enabling High-Performance Execution of Graph Neural Networks on Resource-Constrained Neural Processing Units
Arghadip Das, Shamik Kundu, Arnab Raha, Soumendu Ghosh, Deepak Mathaikutty, Vijay Raghunathan

TL;DR
GraNNite is a hardware-aware framework that significantly accelerates graph neural network execution on resource-constrained devices by optimizing workload distribution, exploiting sparsity, and balancing accuracy with efficiency.
Contribution
GraNNite introduces a structured three-step methodology for optimizing GNNs on commercial off-the-shelf NPUs, covering workload partitioning, performance enhancement, and accuracy-efficiency trade-offs.
Findings
Achieves 2.6X to 7.6X speedups over default NPU mappings.
Realizes up to 8.6X energy savings over CPUs and GPUs.
Delivers 10.8X and 6.7X higher performance than CPUs and GPUs, respectively.
Abstract
Graph Neural Networks (GNNs) are vital for learning from graph-structured data, enabling applications in network analysis, recommendation systems, and speech analytics. Deploying them on edge devices like client PCs and laptops enhances real-time processing, privacy, and cloud independence. GNNs aid Retrieval-Augmented Generation (RAG) for Large Language Models (LLMs) and enable event-based vision tasks. However, irregular memory access, sparsity, and dynamic structures cause high latency and energy overhead on resource-constrained devices. While modern edge processors integrate CPUs, GPUs, and NPUs, NPUs designed for data-parallel tasks struggle with irregular GNN computations. We introduce GraNNite, the first hardware-aware framework optimizing GNN execution on commercial-off-the-shelf (COTS) SOTA DNN accelerators via a structured three-step methodology: (1) enabling NPU execution,…
Taxonomy
Topics: Neural Networks and Applications · Graph Theory and Algorithms · Advanced Graph Neural Networks
Methods: Convolution
