A High-Throughput FPGA Accelerator for Lightweight CNNs With Balanced Dataflow

Zhiyuan Zhao; Yihao Chen; Pengcheng Feng; Jixing Li; Gang Chen; Rongxuan Shen; Huaxiang Lu

arXiv:2407.19449·cs.AR·December 17, 2024·1 cites

Open Access

TL;DR

This paper introduces a multi-Computing-Engine FPGA accelerator with balanced dataflow for lightweight CNNs, significantly reducing memory overhead and boosting computational efficiency.

Contribution

It proposes a novel multi-CE architecture with balanced dataflow and resource-aware optimization, improving performance and scalability for lightweight CNN acceleration.

Findings

01. Reduces on-chip memory by up to 68.3%

02. Achieves up to 2092.4 FPS performance

03. Attains MAC efficiency of 94.58%

Abstract

FPGA accelerators for lightweight convolutional neural networks (LWCNNs) have recently attracted significant attention. Most existing LWCNN accelerators focus on single-Computing-Engine (CE) architectures with local optimization. However, these designs typically suffer from high on-chip/off-chip memory overhead and low computational efficiency due to their layer-by-layer dataflow and unified resource mapping mechanisms. To tackle these issues, a novel multi-CE-based accelerator with balanced dataflow is proposed to efficiently accelerate LWCNNs through memory-oriented and computing-oriented optimizations. Firstly, a streaming architecture with hybrid CEs is designed to minimize off-chip memory access while maintaining a low cost of on-chip buffer size. Secondly, a balanced dataflow strategy is introduced for streaming architectures to enhance computational efficiency by improving…
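
The abstract is cut off before it describes how the dataflow is balanced, so the following is only a minimal sketch of the general idea behind balancing a streaming pipeline, not the authors' method. It assumes one CE per layer, a fixed budget of MAC units, and a simple greedy rule that keeps giving a unit to the slowest stage. The function names, the layer workloads, and the efficiency formula (useful MACs divided by MAC slots available during the bottleneck stage) are all illustrative assumptions.

```python
import math


def balance_pes(layer_macs, pe_budget):
    """Hypothetical greedy allocator: one CE per layer, at least one MAC unit each.

    Repeatedly hands the next MAC unit to the stage that currently takes
    the most cycles, so that stage latencies in the pipeline even out.
    """
    if pe_budget < len(layer_macs):
        raise ValueError("need at least one MAC unit per layer")
    pes = [1] * len(layer_macs)
    for _ in range(pe_budget - len(layer_macs)):
        cycles = [math.ceil(m / p) for m, p in zip(layer_macs, pes)]
        slowest = cycles.index(max(cycles))
        pes[slowest] += 1
    return pes


def mac_efficiency(layer_macs, pes):
    """Assumed definition: useful MACs / (total MAC units * bottleneck cycles)."""
    bottleneck = max(math.ceil(m / p) for m, p in zip(layer_macs, pes))
    return sum(layer_macs) / (sum(pes) * bottleneck)


# Made-up per-layer MAC counts, purely for illustration.
layer_macs = [400, 100, 200, 100]

uniform = [2, 2, 2, 2]                  # same parallelism for every layer
balanced = balance_pes(layer_macs, 8)   # workload-proportional parallelism

print("uniform :", uniform, mac_efficiency(layer_macs, uniform))
print("balanced:", balanced, mac_efficiency(layer_macs, balanced))
```

With these made-up workloads, the uniform mapping leaves three of the four stages idle while the first one finishes, whereas the balanced mapping gives every stage the same latency. Real designs face further constraints (buffer sizes, bandwidth, parallelism factors that must divide channel counts) that this sketch ignores.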

Taxonomy

Topics: CCD and CMOS Imaging Sensors