Hardware Acceleration of Sampling Algorithms in Sample and Aggregate Graph Neural Networks
Yuchen Gui, Boyi Wei, Wei Yuan, Xi Jin

TL;DR
This paper introduces a new neighbor sampling algorithm and an FPGA-based accelerator that significantly speeds up the sampling process in Graph Neural Networks, enabling faster training without sacrificing accuracy.
Contribution
The paper presents a novel CONCAT Sampler designed for efficient hardware acceleration, along with an FPGA-based accelerator that achieves a 300-1000x speedup in sampling.
Findings
CONCAT Sampler maintains test accuracy while being hardware-friendly.
FPGA accelerator achieves 300-1000x faster sampling.
Accelerating the sampling process reduces a major bottleneck in GNN training.
Abstract
Sampling is an important step in many GNN architectures, because it allows training on larger datasets at lower computational complexity. However, compared with other steps in a GNN (such as aggregation and backward propagation), sampling still takes a tremendous amount of time, which limits training speed. Hardware acceleration is an ideal way to reduce sampling time. However, state-of-the-art GNN accelerator proposals do not specify how to accelerate the sampling process. Moreover, directly accelerating traditional sampling algorithms makes the structure of the accelerator very complicated. In this work, we make two contributions: (1) we propose a new neighbor sampler, the CONCAT Sampler, which can be easily accelerated at the hardware level while maintaining test accuracy; (2) we design a CONCAT-Sampler accelerator based on FPGA, with which the neighbor sampling process…
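For readers unfamiliar with the sampling step the abstract refers to, the following is a minimal sketch of conventional uniform neighbor sampling with a fixed fan-out per hop, as commonly used in sample-and-aggregate GNNs. It is not the CONCAT Sampler, whose details are not given here. The function names (`sample_neighbors`, `sample_k_hops`), the adjacency-dictionary representation, and the fan-out values are all illustrative assumptions.

```python
import random


def sample_neighbors(adj, nodes, fanout, rng):
    """Pick up to `fanout` neighbors uniformly at random for each node.

    `adj` is assumed to map a node id to a list of neighbor ids
    (a hypothetical representation chosen for this sketch).
    """
    sampled = {}
    for v in nodes:
        neighbors = adj.get(v, [])
        if len(neighbors) <= fanout:
            # Too few neighbors to subsample: keep them all.
            sampled[v] = list(neighbors)
        else:
            # Sample without replacement.
            sampled[v] = rng.sample(neighbors, fanout)
    return sampled


def sample_k_hops(adj, seeds, fanouts, seed=0):
    """Sample one layer of neighbors per entry in `fanouts`, hop by hop."""
    rng = random.Random(seed)
    layers = []
    frontier = list(seeds)
    for fanout in fanouts:
        picked = sample_neighbors(adj, frontier, fanout, rng)
        layers.append(picked)
        # The next frontier is the set of sampled neighbors, de-duplicated
        # while preserving first-seen order.
        frontier = list(dict.fromkeys(u for ns in picked.values() for u in ns))
    return layers


if __name__ == "__main__":
    # Toy graph, purely for illustration.
    toy_adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2], 4: [0]}
    for hop, layer in enumerate(sample_k_hops(toy_adj, seeds=[0], fanouts=[2, 2])):
        print(f"hop {hop + 1}: {layer}")
```

The per-node random selection and the data-dependent frontier in this loop illustrate the kind of irregular work that a sampling step involves.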
Taxonomy
Topics: Brain Tumor Detection and Classification · Advanced Graph Neural Networks · Neural Networks and Applications
Methods: Test · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
