Efficient Edge AI: Deploying Convolutional Neural Networks on FPGA with the Gemmini Accelerator

Federico Nicolas Peccia; Svetlana Pavlitska; Tobias Fleck; Oliver Bringmann

arXiv:2408.07404·cs.AR·August 15, 2024

Open Access

TL;DR

This paper presents an end-to-end workflow for deploying CNNs on FPGAs using a modified Gemmini accelerator, achieving real-time performance and high energy efficiency for edge AI applications.

Contribution

It introduces a customized FPGA deployment process for CNNs with open source tools, demonstrating improved performance and energy efficiency over existing solutions.

Findings

01

Achieved real-time YOLOv7 deployment on FPGA

02

Energy efficiency of 36.5 GOP/s/W

03

Outperforms other FPGA implementations
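
The GOP/s/W figure in finding 02 is a throughput-per-power ratio: operations per second, in units of 10^9, divided by the power drawn in watts. The sketch below shows that arithmetic only. The function name and all input values are hypothetical placeholders chosen for illustration; they are not measurements from the paper.

```python
def energy_efficiency_gops_per_watt(ops_per_inference: float,
                                    latency_s: float,
                                    power_w: float) -> float:
    """Return energy efficiency in GOP/s/W.

    ops_per_inference: operations executed for one inference
    latency_s:         time for one inference, in seconds
    power_w:           average power draw during inference, in watts
    """
    if latency_s <= 0 or power_w <= 0:
        raise ValueError("latency_s and power_w must be positive")
    # Throughput in giga-operations per second.
    throughput_gops = ops_per_inference / latency_s / 1e9
    # Normalize throughput by power to get efficiency.
    return throughput_gops / power_w


# Hypothetical inputs (NOT from the paper): 10 GOP per inference,
# 100 ms latency, 5 W average power.
example = energy_efficiency_gops_per_watt(
    ops_per_inference=10e9, latency_s=0.1, power_w=5.0
)
print(f"{example:.1f} GOP/s/W")
```

Comparing two deployments on this metric is only meaningful when the operation count and the power measurement are defined the same way for both.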

Abstract

Growing concerns about energy consumption and privacy have prompted the development of AI solutions deployable on the edge, circumventing the substantial CO2 emissions associated with cloud servers and mitigating the risks of sharing sensitive data. However, deploying Convolutional Neural Networks (CNNs) on non-off-the-shelf edge devices remains a complex and labor-intensive task. In this paper, we present an end-to-end workflow for deploying CNNs on Field Programmable Gate Arrays (FPGAs) using the Gemmini accelerator, which we modified for efficient implementation on FPGAs. We describe how we use open source software at each optimization step of the deployment process, the customizations we added to these tools, and their impact on the final system's performance. We were able to achieve real-time performance by deploying a YOLOv7 model on a Xilinx ZCU102 FPGA with an…

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Advanced Neural Network Applications · Embedded Systems Design Techniques