Accelerating CNN inference on FPGAs: A Survey

Kamel Abdelouahab; Maxime Pelcat; Jocelyn Serot; François Berry

arXiv:1806.01683 · cs.DC · June 6, 2018 · 103 cites

TL;DR

This survey reviews recent FPGA-based CNN inference accelerators, analyzing their architectures, optimizations, and performance, highlighting trends and future directions in hardware deep learning acceleration.

Contribution

It provides a comprehensive overview of recent FPGA CNN acceleration methods, comparing various optimization techniques and identifying key research trends.

Findings

01. FPGAs are well-suited for CNN workloads due to their reconfigurability.

02. Recent methods improve performance through convolutional and memory optimizations.

03. State-of-the-art approaches demonstrate significant speedups in CNN inference on FPGAs.

Abstract

Convolutional Neural Networks (CNNs) are currently adopted to solve an ever greater number of problems, ranging from speech recognition to image classification and segmentation. The large amount of processing required by CNNs calls for dedicated and tailored hardware support methods. Moreover, CNN workloads have a streaming nature, well suited to reconfigurable hardware architectures such as FPGAs. The amount and diversity of research on the subject of CNN FPGA acceleration within the last 3 years demonstrates the tremendous industrial and academic interest. This paper presents a state-of-the-art of CNN inference accelerators over FPGAs. The computational workloads, their parallelism and the involved memory accesses are analyzed. At the level of neurons, optimizations of the convolutional and fully connected layers are explained and the performances of the different methods compared. At…

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Machine Learning and ELM