Performance evaluation over HW/SW co-design SoC memory transfers for a CNN accelerator

A. Rios-Navarro; R. Tapiador-Morales; A. Jimenez-Fernandez; M. Dominguez-Morales; C. Amaya; A. Linares-Barranco

arXiv:1806.01106 · cs.DC · June 5, 2018 · 1 citation

Open Access

TL;DR

This paper evaluates the performance of data transfers between the processing system (PS) and the programmable logic (PL) in a Xilinx Zynq FPGA for CNN acceleration, comparing polling and interrupt-based transfer management methods.

Contribution

It introduces and compares data partitioning and transfer management techniques for optimizing PS-PL data throughput in CNN FPGA accelerators.
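As a rough illustration of what data partitioning means here, the sketch below splits a frame buffer into fixed-size packets before transfer. It is not the authors' implementation: the function name `partition_frame`, the 4096-byte frame, and the 1024-byte packet size are all hypothetical values chosen only for the example.

```python
from typing import Iterator


def partition_frame(frame: bytes, packet_bytes: int) -> Iterator[bytes]:
    """Yield consecutive slices of `frame`, each at most `packet_bytes` long."""
    if packet_bytes <= 0:
        raise ValueError("packet_bytes must be positive")
    for offset in range(0, len(frame), packet_bytes):
        yield frame[offset:offset + packet_bytes]


# Hypothetical sizes: a 4096-byte stand-in for one normalized frame,
# split into 1024-byte packets.
frame = bytes(range(256)) * 16
packets = list(partition_frame(frame, packet_bytes=1024))
print(len(packets), [len(p) for p in packets])
```

Choosing the packet size trades per-transfer overhead against latency, which is the kind of balance the paper evaluates on real hardware.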

Findings

01. Kernel-level driver improves timing for longer packets.

02. Balanced data transfer enhances CNN processing efficiency.

03. Interrupt-based management offers safer and more effective control.
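The difference between polling and interrupt-style completion handling can be sketched in user-space Python, with a thread standing in for the transfer engine. This is only an analogy under stated assumptions: `start_fake_transfer`, `wait_polling`, and `wait_blocking` are hypothetical names, the 50 ms duration is arbitrary, and no real AXI or DMA driver interface is used.

```python
import threading
import time


def start_fake_transfer(done: threading.Event, duration_s: float) -> threading.Thread:
    """Simulate a transfer that signals `done` after `duration_s` seconds."""
    def worker() -> None:
        time.sleep(duration_s)   # stand-in for the hardware moving data
        done.set()               # stand-in for the completion signal
    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def wait_polling(done: threading.Event) -> int:
    """Repeatedly check the completion flag; return how many checks were made."""
    checks = 0
    while not done.is_set():
        checks += 1              # the CPU stays busy while waiting
    return checks


def wait_blocking(done: threading.Event, timeout_s: float) -> bool:
    """Sleep until signalled, as an interrupt-driven wait would; True if completed."""
    return done.wait(timeout=timeout_s)


# Polling: the caller spins until the flag flips.
poll_done = threading.Event()
poll_thread = start_fake_transfer(poll_done, 0.05)
checks = wait_polling(poll_done)
poll_thread.join()

# Blocking: the caller sleeps and a timeout bounds the wait.
block_done = threading.Event()
block_thread = start_fake_transfer(block_done, 0.05)
completed = wait_blocking(block_done, timeout_s=1.0)
block_thread.join()

print("polling checks:", checks, "blocking completed:", completed)
```

The blocking variant leaves the processor free for other work and gives a natural place for a timeout, which is one plausible reading of why the interrupt-based approach is described as safer.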

Abstract

Many FPGA vendors have recently included embedded processors in their devices, such as Xilinx with ARM Cortex-A cores, together with programmable logic cells. These devices are known as Programmable System on Chip (PSoC). Their ARM cores (embedded in the processing system, or PS) communicate with the programmable logic cells (PL) using ARM-standard AXI buses. In this paper we analyze the performance of exhaustive data transfers between PS and PL for a Xilinx Zynq FPGA in a real co-design scenario for a Convolutional Neural Network (CNN) accelerator, which processes, in dedicated hardware, a stream of visual information from a neuromorphic visual sensor for classification. On the PS side, a Linux operating system is running, which collects visual events from the neuromorphic sensor into a normalized frame and then transfers these frames to the accelerator of multi-layered CNNs, and…

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing · Neuroscience and Neural Engineering