Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA

Hongkuan Zhou; Bingyi Zhang; Rajgopal Kannan; Viktor Prasanna; Carl Busart

arXiv:2203.05095·cs.AR·March 11, 2022

TL;DR

This paper introduces a co-designed FPGA hardware and simplified model architecture for high-performance inference of temporal graph neural networks, achieving efficiency through optimized attention computation, neighbor pruning, and hardware techniques.

Contribution

It presents a novel combined model-architecture design for efficient TGNN inference on FPGAs, including lightweight attention, neighbor pruning, and hardware optimizations.

Findings

01. Achieved high throughput on real-world datasets.

02. Reduced computation and memory access through pruning and hardware design.

03. Maintained accuracy with knowledge distillation.

Abstract

Temporal Graph Neural Networks (TGNNs) are powerful models to capture temporal, structural, and contextual information on temporal graphs. The generated temporal node embeddings outperform other methods in many downstream tasks. Real-world applications require high performance inference on real-time streaming dynamic graphs. However, these models usually rely on complex attention mechanisms to capture relationships between temporal neighbors. In addition, maintaining vertex memory suffers from intrinsic temporal data dependency that hinders task-level parallelism, making it inefficient on general-purpose processors. In this work, we present a novel model-architecture co-design for inference in memory-based TGNNs on FPGAs. The key modeling optimizations we propose include a light-weight method to compute attention scores and a related temporal neighbor pruning strategy to further reduce…

Code & Models

Repositories

zjjzby/tgnn-fpga-ipdps2022 (Official)

Taxonomy

Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Data Quality and Management

Methods: Pruning · Knowledge Distillation