A flexible FPGA accelerator for convolutional neural networks
Kingshuk Majumder, Shubham Nema, Uday Bondhugula

TL;DR
This paper presents a flexible FPGA-based CNN inference accelerator that minimizes off-chip memory access through various reuse strategies, supports high resource utilization without reconfiguration, and integrates with TensorFlow for ease of programming.
Contribution
The paper introduces a novel FPGA CNN accelerator design that exploits multiple reuse strategies, maintains high utilization without reconfiguration, and provides a TensorFlow-compatible software framework.
Findings
Sustains a high clock frequency as the number of processing elements (PEs) increases.
Maintains a significant fraction of theoretical peak performance.
Effectively reduces off-chip memory access in CNN inference.
Abstract
Though CNNs are highly parallel workloads, in the absence of efficient on-chip memory reuse techniques, an accelerator for them quickly becomes memory bound. In this paper, we propose a CNN accelerator design for inference that is able to exploit all forms of reuse available to minimize off-chip memory access while increasing utilization of available resources. The proposed design is composed of cores, each of which contains a one-dimensional array of processing elements. These cores can exploit different types of reuse available in CNN layers of varying shapes without requiring any reconfiguration; in particular, our design minimizes underutilization due to problem sizes that are not perfect multiples of the underlying hardware array dimensions. A major obstacle in the adoption of FPGAs as a platform for CNN inference is the difficulty to program these devices using hardware…
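The underutilization the abstract mentions, when a problem size is not a perfect multiple of the hardware array dimension, can be illustrated with a small calculation. The sketch below is not the paper's design. It models a naive mapping in which a loop of a given size is tiled onto a fixed-width one-dimensional PE array and the last pass is partly idle. The function name `array_utilization` and the sizes used are hypothetical, chosen only for illustration.

```python
def array_utilization(problem_size: int, array_width: int) -> float:
    """Fraction of PE slots doing useful work under naive tiling.

    Assumption (not from the paper): the loop of `problem_size` iterations
    is split into passes of `array_width` iterations each, and PEs with no
    work in the final pass simply sit idle.
    """
    if problem_size <= 0 or array_width <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: number of passes over the PE array.
    passes = (problem_size + array_width - 1) // array_width
    return problem_size / (passes * array_width)


if __name__ == "__main__":
    # Hypothetical layer dimensions mapped onto a hypothetical 64-PE array.
    for size in (64, 96, 100):
        print(size, array_utilization(size, 64))
```

Under this naive mapping, a dimension of 96 on 64 PEs needs two passes and uses only 75% of the available slots. The abstract states that the proposed design minimizes this kind of loss; how it does so is not described in the text above.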
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Neural Networks and Applications · Advanced Neural Network Applications
