DF-GNN: Dynamic Fusion Framework for Attention Graph Neural Networks on GPUs
Jiahui Liu, Zhenkun Cai, Zhiyong Chen, Minjie Wang

TL;DR
DF-GNN is a dynamic kernel fusion framework with bi-level thread scheduling for attention-based GNNs on GPUs. It reports up to 7.0x speedup over the non-fusion DGL library and an average 2.16x speedup in end-to-end training over DGL.
Contribution
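The paper presents a dynamic kernel fusion framework with adaptive thread scheduling tailored to attention GNNs. It targets two GPU training inefficiencies: the heavy data movement and kernel launch overhead of unfused execution, and the redundant computation and unbalanced workload caused by fixed thread scheduling in existing fused kernels.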
Findings
Achieves up to 7.0x speedup over the non-fusion DGL library.
Surpasses existing GNN kernel optimization tools such as cuGraph and dgNN.
Provides an average 2.16x speedup in end-to-end training over DGL.
Abstract
Attention Graph Neural Networks (AT-GNNs), such as GAT and Graph Transformer, have demonstrated superior performance compared to other GNNs. However, existing GNN systems struggle to efficiently train AT-GNNs on GPUs due to their intricate computation patterns. The execution of AT-GNN operations without kernel fusion results in heavy data movement and significant kernel launch overhead, while fixed thread scheduling in existing GNN kernel fusion strategies leads to sub-optimal performance, redundant computation and unbalanced workload. To address these challenges, we propose a dynamic kernel fusion framework, DF-GNN, for the AT-GNN family. DF-GNN introduces a dynamic bi-level thread scheduling strategy, enabling flexible adjustments to thread scheduling while retaining the benefits of shared memory within the fused kernel. DF-GNN tailors specific thread scheduling for operations in…
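To make the fusion argument in the abstract concrete, here is a minimal CPU-only NumPy sketch. It is an illustration under stated assumptions, not DF-GNN's kernels or its thread scheduling. It assumes a dot-product attention layer on a graph stored in CSR form, with rows as destination nodes, and the function names `unfused_attention` and `fused_attention` are hypothetical. The unfused version runs three passes that each materialize a full per-edge array, which on a GPU would correspond to separate kernels exchanging data through global memory. The fused version computes score, softmax, and aggregation for one destination node at a time, so intermediates stay local.

```python
import numpy as np

def unfused_attention(indptr, indices, q, k, v):
    """Three separate passes, each materializing a full per-edge array."""
    num_nodes = len(indptr) - 1
    # Destination node id of every edge (CSR rows are destinations here).
    dst = np.repeat(np.arange(num_nodes), np.diff(indptr))
    # Pass 1 (SDDMM-like): one attention score per edge.
    scores = np.einsum("ed,ed->e", q[dst], k[indices])
    # Pass 2 (edge softmax): normalize over each destination's incoming edges.
    row_max = np.full(num_nodes, -np.inf)
    np.maximum.at(row_max, dst, scores)
    exp = np.exp(scores - row_max[dst])
    row_sum = np.zeros(num_nodes)
    np.add.at(row_sum, dst, exp)
    alpha = exp / row_sum[dst]
    # Pass 3 (SpMM-like): attention-weighted sum of neighbor features.
    out = np.zeros((num_nodes, v.shape[1]))
    np.add.at(out, dst, alpha[:, None] * v[indices])
    return out

def fused_attention(indptr, indices, q, k, v):
    """One pass per destination node; no full per-edge arrays are stored."""
    num_nodes = len(indptr) - 1
    out = np.zeros((num_nodes, v.shape[1]))
    for i in range(num_nodes):
        nbrs = indices[indptr[i]:indptr[i + 1]]
        if nbrs.size == 0:
            continue  # isolated node: output row stays zero
        s = k[nbrs] @ q[i]               # scores for this node's edges
        w = np.exp(s - s.max())          # numerically stable softmax
        out[i] = (w / w.sum()) @ v[nbrs] # aggregate neighbor features
    return out

# Toy graph with 4 nodes and 5 edges (hypothetical data).
indptr = np.array([0, 2, 3, 3, 5])
indices = np.array([1, 2, 0, 0, 2])
rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 4, 8))
same = np.allclose(unfused_attention(indptr, indices, q, k, v),
                   fused_attention(indptr, indices, q, k, v))
```

Both functions return the same result. They differ only in how much intermediate state is written out between steps, which is the cost the abstract attributes to execution without kernel fusion.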
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Brain Tumor Detection and Classification
Methods: Dense Connections · Laplacian EigenMap · Label Smoothing · Dropout · Linear Layer · Laplacian Positional Encodings · Layer Normalization · Byte Pair Encoding · Adam · Residual Connection
