Systolic-CNN: An OpenCL-defined Scalable Run-time-flexible FPGA Accelerator Architecture for Accelerating Convolutional Neural Network Inference in Cloud/Edge Computing
Akshay Dua, Yixing Li, Fengbo Ren

TL;DR
Systolic-CNN is a scalable, flexible FPGA accelerator architecture designed for efficient CNN inference in cloud and edge computing, supporting multiple models at run time without reprogramming.
Contribution
The paper introduces a highly scalable, run-time-flexible FPGA accelerator architecture built around a pipelined 1-D systolic array, which improves resource utilization and computational parallelism for CNN inference.
Findings
Achieves up to 100% utilization of the DSP blocks on a given FPGA.
Provides low inference latency for various CNN models.
Supports multiple CNN models at run time without reprogramming.
Abstract
This paper presents Systolic-CNN, an OpenCL-defined scalable, run-time-flexible FPGA accelerator architecture, optimized for accelerating the inference of various convolutional neural networks (CNNs) in multi-tenancy cloud/edge computing. Existing OpenCL-defined FPGA accelerators for CNN inference are insufficient because they offer limited flexibility for supporting multiple CNN models at run time and scale poorly, which results in underutilized FPGA resources and limited computational parallelism. Systolic-CNN adopts a highly pipelined and parallelized 1-D systolic array architecture, which efficiently explores both spatial and temporal parallelism for accelerating CNN inference on FPGAs. Systolic-CNN is highly scalable and parameterized, and can be easily adapted by users to achieve up to 100% utilization of the coarse-grained computation resources (i.e., DSP blocks) for a given FPGA.…
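To make the idea of a 1-D systolic array concrete, the following is a minimal cycle-by-cycle simulation in Python. It is a generic sketch, not the authors' OpenCL design: the function name `systolic_matvec`, the choice of one weight row per processing element (PE), and the matrix-vector workload are all assumptions made for illustration. Inputs are handed from PE to PE one hop per cycle (temporal reuse), while all PEs perform a multiply-accumulate in the same cycle (spatial parallelism).

```python
def systolic_matvec(weights, x):
    """Simulate a 1-D chain of PEs computing weights @ x.

    Hypothetical illustration: PE k holds row k of `weights` and
    accumulates one output value; it is not the paper's architecture.
    Returns (outputs, number_of_cycles).
    """
    num_pes = len(weights)
    length = len(x)
    acc = [0] * num_pes        # one accumulator per PE
    pipe = [None] * num_pes    # input register of each PE (None = pipeline bubble)
    idx = [0] * num_pes        # index of the next weight each PE will consume
    cycles = length + num_pes - 1  # stream length plus pipeline fill/drain

    for t in range(cycles):
        # Shift every input one PE down the chain; feed a new input at PE 0.
        incoming = x[t] if t < length else None
        pipe = [incoming] + pipe[:-1]

        # All PEs that currently hold a valid input do one MAC this cycle.
        for k in range(num_pes):
            if pipe[k] is not None:
                acc[k] += weights[k][idx[k]] * pipe[k]
                idx[k] += 1

    return acc, cycles


if __name__ == "__main__":
    outputs, cycles = systolic_matvec([[1, 2, 3], [4, 5, 6]], [1, 0, -1])
    print(outputs, cycles)
```

In this sketch each input value is fetched once and then reused by every PE as it moves along the chain, so adding PEs raises parallelism without raising input bandwidth; the cost is `num_pes - 1` extra cycles to fill and drain the pipeline.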
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing
Methods: Convolution · Focal Loss · 1x1 Convolution · Feature Pyramid Network · RetinaNet
