Semi-Streaming Architecture: A New Design Paradigm for CNN Implementation on FPGAs

Nazariy K. Shaydyuk; Eugene B. John

arXiv:2006.08759 · eess.IV · June 17, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a semi-streaming FPGA architecture for CNNs that combines layer-specific processing engines to improve efficiency and flexibility, demonstrated with an 8-bit MobileNetV2 implementation.

Contribution

It proposes a novel semi-streaming design paradigm that integrates specialized engines for different CNN layers, enhancing resource utilization and performance.

Findings

1. Achieved up to 89.6 GOp/s throughput for certain layers.

2. Energy efficiency of 5.32 GOp/s/W at 100 MHz.

3. Implemented a flexible, layer-specific FPGA CNN accelerator.
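
For readers unfamiliar with the units in the findings above, the following is a minimal sketch of how throughput (GOp/s) and energy efficiency (GOp/s/W) figures of this kind are typically derived. It is not taken from the paper. The layer shape, latency, and power values are hypothetical placeholders, the function names are illustrative, and counting one multiply-accumulate as two operations is an assumed convention that the paper may or may not follow.

```python
def conv_ops(out_h, out_w, in_ch, out_ch, kernel=1, depthwise=False):
    """Operation count for one convolution layer.

    Assumption: each multiply-accumulate (MAC) counts as 2 operations.
    For a depthwise convolution, each output channel reads one input channel.
    """
    macs_per_output = kernel * kernel * (1 if depthwise else in_ch)
    return 2 * out_h * out_w * out_ch * macs_per_output


def throughput_gops(ops, seconds):
    """Throughput in giga-operations per second (GOp/s)."""
    return ops / seconds / 1e9


def efficiency_gops_per_watt(gops, watts):
    """Energy efficiency in GOp/s per watt."""
    return gops / watts


# Hypothetical 1x1 (pointwise) convolution: 112x112 output, 32 -> 16 channels.
ops = conv_ops(112, 112, in_ch=32, out_ch=16, kernel=1)

# Hypothetical measurements, chosen only to exercise the formulas.
latency_s = 200e-6   # assumed layer latency in seconds
power_w = 10.0       # assumed board power in watts

gops = throughput_gops(ops, latency_s)
eff = efficiency_gops_per_watt(gops, power_w)
print(f"{ops} ops, {gops:.2f} GOp/s, {eff:.2f} GOp/s/W")
```

Because the per-layer operation count depends on layer type (the depthwise case does far fewer MACs per output than a standard convolution), per-layer throughput can differ substantially within one network, which is consistent with the "for certain layers" qualifier on the peak figure.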

Abstract

Recent research advances in deep learning have led to the development of small and powerful Convolutional Neural Network (CNN) architectures. Meanwhile, Field Programmable Gate Arrays (FPGAs) have become a popular hardware target for their deployment, with implementations splitting into two main categories: streaming hardware architectures and single computation engine designs. Streaming hardware architectures generally require implementing every layer as a discrete processing unit and are suitable for smaller software models whose unfolded versions fit into resource-constrained targets. Single computation engines, on the other hand, can be scaled to fit into a device to execute CNN models of different sizes and complexities; however, the achievable performance of one-size-fits-all implementations may vary across CNNs with different workload attributes…

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors