AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks

Pedro Gimenes; Yiren Zhao; George Constantinides

arXiv:2502.21196·cs.AR·March 3, 2025

TL;DR

AMPLE is an FPGA-based accelerator for GNN inference that combines an event-driven programming flow with mixed-precision quantization, achieving speedups of 243x over CPU and 7.2x over GPU implementations.

Contribution

The paper introduces an event-driven FPGA architecture with node-level mixed-precision quantization and a prefetcher for data and instructions, improving GNN inference performance on irregular graph data.
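To make "mixed-precision quantization" at node granularity concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the symmetric int8 scheme, the function names (`quantize_int8`, `store_features`, `load_features`), and the per-node precision tags are all assumptions chosen for illustration, and the summary does not say how AMPLE decides which nodes get which precision.

```python
def quantize_int8(values):
    """Symmetric int8 quantization (assumed scheme): returns integer codes and a scale."""
    peak = max(abs(v) for v in values)
    scale = peak / 127 if peak > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def store_features(features, precision):
    """Store each node's feature vector in the format tagged for that node."""
    stored = {}
    for node, values in features.items():
        if precision[node] == "int8":
            codes, scale = quantize_int8(values)
            stored[node] = ("int8", codes, scale)
        else:
            # Float nodes are kept as-is; the scale of 1.0 is a placeholder.
            stored[node] = ("float", list(values), 1.0)
    return stored


def load_features(entry):
    """Recover approximate real-valued features from a stored entry."""
    _, values, scale = entry
    return [v * scale for v in values]


# Hypothetical toy graph: three nodes, three features each.
features = {0: [0.8, -0.2, 0.05], 1: [12.0, -7.5, 3.25], 2: [0.1, 0.4, -0.3]}
precision = {0: "int8", 1: "float", 2: "int8"}  # hypothetical per-node tags
stored = store_features(features, precision)
```

In this sketch, node 1 keeps its exact float values while nodes 0 and 2 are reconstructed to within half a quantization step, which is the kind of per-node accuracy and cost trade-off that node-level precision allows.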

Findings

01

Achieved 243x speedup over CPU

02

Achieved 7.2x speedup over GPU

03

Effective handling of irregular memory access patterns

Abstract

Graph Neural Networks (GNNs) have recently gained attention due to their performance on non-Euclidean data. The use of custom hardware architectures proves particularly beneficial for GNNs due to their irregular memory access patterns, resulting from the sparse structure of graphs. However, existing FPGA accelerators are limited by their double buffering mechanism, which does not account for the irregular node distribution in typical graph datasets. To address this, we introduce AMPLE (Accelerated Message Passing Logic Engine), an FPGA accelerator leveraging a new event-driven programming flow. We develop a mixed-arithmetic architecture, enabling GNN inference to be quantized at a node-level granularity. Finally, a prefetcher for data and instructions is implemented to optimize off-chip memory access and maximize node parallelism. Evaluation on citation and social media graph…
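The abstract's argument against double buffering can be illustrated with a toy scheduling model. The sketch below is an assumption-laden simplification, not AMPLE's microarchitecture: it assumes each node costs one cycle per neighbour, that a double-buffered design processes nodes in lock-step batches, and that an event-driven design hands the next node to whichever lane becomes free first. The function names and the degree list are hypothetical.

```python
import heapq


def batched_cycles(degrees, lanes):
    """Lock-step batches: every batch waits for its slowest (highest-degree) node."""
    total = 0
    for start in range(0, len(degrees), lanes):
        total += max(degrees[start:start + lanes])
    return total


def event_driven_cycles(degrees, lanes):
    """Each lane picks up the next node as soon as it finishes its current one."""
    free_at = [0] * lanes  # min-heap of the times at which each lane becomes idle
    for work in degrees:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + work)
    return max(free_at)


# Hypothetical skewed degree distribution: a few hub nodes among many small ones.
degrees = [1, 1, 50, 2, 1, 3, 40, 1, 2, 1, 1, 2]
batched = batched_cycles(degrees, lanes=4)
event_driven = event_driven_cycles(degrees, lanes=4)
print(batched, event_driven)  # prints "92 50"
```

Under these assumptions the batched schedule stalls three lanes behind each hub node, while the event-driven schedule keeps the remaining lanes busy with low-degree nodes, so its total is bounded by the largest single node rather than by the sum of per-batch maxima.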

Taxonomy

Methods: Softmax · Attention Is All You Need