Synergy: A HW/SW Framework for High Throughput CNNs on Embedded Heterogeneous SoC

Guanwen Zhong; Akshat Dubey; Tan Cheng; Tulika Mitra

arXiv:1804.00706 · cs.DC · March 7, 2019

TL;DR

Synergy is a hardware-software co-designed framework that enables high-throughput, energy-efficient CNN inference on embedded heterogeneous SoCs by leveraging multi-threading and adaptive workload balancing across FPGA and NEON accelerators.

Contribution

It introduces a unified, adaptable framework for CNN inference on embedded SoCs that efficiently utilizes all on-chip resources without hardware modifications.

Findings

1. Achieves 7.3X speedup over software-only solutions

2. Demonstrates superior throughput and energy efficiency

3. Supports runtime adaptation to different CNN configurations

Abstract

Convolutional Neural Networks (CNNs) have been widely deployed in diverse application domains. There has been significant progress in accelerating both their training and inference using high-performance GPUs, FPGAs, and custom ASICs for datacenter-scale environments. The recent proliferation of mobile and IoT devices has necessitated real-time, energy-efficient deep neural network inference on embedded-class, resource-constrained platforms. In this context, we present Synergy, an automated, hardware-software co-designed, pipelined, high-throughput CNN inference framework on embedded heterogeneous system-on-chip (SoC) architectures (Xilinx Zynq). Synergy leverages, through multi-threading, all the available on-chip resources, which include the dual-core ARM processor along with the FPGA and the NEON SIMD engines as accelerators. Moreover, Synergy provides a unified…
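
The idea of keeping several heterogeneous accelerators busy through multi-threading can be illustrated with a small sketch. This is not Synergy's implementation: it assumes a shared queue of output tiles from which worker threads pull work as they become free, and the worker names (`fpga-0`, `fpga-1`, `neon-0`), the tile size, and the functions `matmul_tile` and `run_balanced` are all hypothetical. The accelerators themselves are stood in for by plain Python arithmetic.

```python
import queue
import threading


def matmul_tile(a, b, rows, cols):
    """Compute one output tile of A x B for the given row and column ranges."""
    inner = len(b)
    return {
        (i, j): sum(a[i][k] * b[k][j] for k in range(inner))
        for i in rows
        for j in cols
    }


def run_balanced(a, b, tile=2, workers=("fpga-0", "fpga-1", "neon-0")):
    """Split C = A x B into tiles and let worker threads pull them from one queue.

    Each worker stands in for an accelerator (assumption: real hardware calls
    are replaced by matmul_tile). A worker that finishes early simply takes
    the next tile, so faster workers end up processing more tiles.
    """
    n, m = len(a), len(b[0])
    jobs = queue.Queue()
    for r in range(0, n, tile):
        for c in range(0, m, tile):
            jobs.put((range(r, min(r + tile, n)), range(c, min(c + tile, m))))

    result = [[0] * m for _ in range(n)]
    done_by = {name: 0 for name in workers}
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                rows, cols = jobs.get_nowait()
            except queue.Empty:
                return  # no tiles left for this worker
            tile_out = matmul_tile(a, b, rows, cols)
            with lock:  # protect the shared result and counters
                for (i, j), value in tile_out.items():
                    result[i][j] = value
                done_by[name] += 1

    threads = [threading.Thread(target=worker, args=(w,)) for w in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result, done_by
```

Calling `run_balanced(a, b)` returns the product matrix together with a per-worker tile count. The counts can differ from run to run, since each tile goes to whichever worker asks for it first; only their sum (the total number of tiles) is fixed.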
