Software-Defined FPGA Accelerator Design for Mobile Deep Learning Applications

Panagiotis G. Mousouliotis; Loukas P. Petrou

arXiv:1902.03192 · cs.CV · March 26, 2019 · 1 citation

Open Access

TL;DR

This paper introduces a workflow for designing FPGA-based accelerators for mobile deep learning applications, simplifying development with high-level synthesis tools and providing performance estimation models.

Contribution

It presents a novel workflow and an HLS-driven analytical model for efficient FPGA accelerator design targeting mobile deep learning tasks.

Findings

01. Accelerator design for low-power FPGA devices is feasible with the proposed workflow.

02. The analytical model effectively estimates performance and guides design improvements.

03. The approach enables faster development of mobile-friendly CNN accelerators.

Abstract

Recently, deep learning has received great attention from the scientific community, and it is used to provide improved solutions to many computer vision problems. Convolutional neural networks (CNNs) have been successfully applied to problems such as object recognition, object detection, semantic segmentation, and scene understanding. The rapid development of deep learning goes hand in hand with the adoption of GPUs for accelerating its processes, such as network training and inference. Even though FPGA design predates the use of GPUs for accelerating computations, and despite the fact that high-level synthesis (HLS) tools are becoming more attractive, the adoption of FPGAs for deep learning research and application development remains poor because it requires hardware design expertise. This work presents a workflow for deep learning mobile application…

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing · Advanced Neural Network Applications

Methods: Residual Connection · Convolution · Average Pooling · Fire Module · Global Average Pooling · 1x1 Convolution · Dropout · Xavier Initialization · Max Pooling