eCNN: A Block-Based and Highly-Parallel CNN Accelerator for Edge Inference
Chao-Tsung Huang, Yu-Chun Ding, Huan-Ching Wang, Chi-Wen Weng, Kai-Ping Lin, Li-Wei Wang, Li-De Chen

TL;DR
This paper introduces eCNN, a highly-parallel, block-based CNN accelerator optimized for edge inference, capable of supporting ultra-high-resolution videos efficiently with reduced power and memory usage.
Contribution
It proposes a novel block-based inference flow, a hardware-oriented network model ERNet, and a coarse-grained instruction set FBISA, integrated into an embedded processor eCNN for efficient edge CNN inference.
Findings
Supports 4K Ultra-HD video at 30 fps with low power consumption
Eliminates DRAM bandwidth for feature maps using block-based inference
Outperforms state-of-the-art in power efficiency and resolution support
Abstract
Convolutional neural networks (CNNs) have recently demonstrated superior quality for computational imaging applications. Therefore, they have great potential to revolutionize the image pipelines on cameras and displays. However, it is difficult for conventional CNN accelerators to support ultra-high-resolution videos at the edge due to their considerable DRAM bandwidth and power consumption. Therefore, finding a further memory- and computation-efficient microarchitecture is crucial to speed up this coming revolution. In this paper, we approach this goal by considering the inference flow, network model, instruction set, and processor design jointly to optimize hardware performance and image quality. We apply a block-based inference flow which can eliminate all the DRAM bandwidth for feature maps and accordingly propose a hardware-oriented network model, ERNet, to optimize image quality…
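The block-based inference flow mentioned in the abstract can be illustrated with a minimal sketch. It is not the eCNN or ERNet implementation: the function names, the single-channel 3x3 integer layers, and the block size are all assumptions chosen for illustration. The idea shown is that each output block is computed from an input tile enlarged by a halo, so every intermediate feature map stays inside the tile (standing in for on-chip buffers) and only the input image and the final output touch the large frame-sized arrays (standing in for DRAM).

```python
# Minimal sketch of block-based inference (hypothetical names, not the eCNN code).
# Assumption: the network is a stack of single-channel 3x3 "valid" convolutions.

def conv3x3_valid(img, kernel):
    """3x3 valid convolution (cross-correlation) on a 2-D list of ints."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(kernel[dy][dx] * img[y + dy][x + dx]
                for dy in range(3) for dx in range(3))
            for x in range(w - 2)
        ]
        for y in range(h - 2)
    ]


def run_layers(img, kernels):
    """Apply every layer in order; intermediates live only in local variables."""
    for k in kernels:
        img = conv3x3_valid(img, k)
    return img


def frame_based(img, kernels):
    """Reference flow: each layer produces a full-frame feature map."""
    return run_layers(img, kernels)


def block_based(img, kernels, block=4):
    """Block flow: process one input tile at a time through all layers."""
    halo = 2 * len(kernels)  # each 3x3 valid layer shrinks the tile by 2 pixels
    out_h, out_w = len(img) - halo, len(img[0]) - halo
    out = [[0] * out_w for _ in range(out_h)]
    for by in range(0, out_h, block):
        for bx in range(0, out_w, block):
            bh = min(block, out_h - by)
            bw = min(block, out_w - bx)
            # Read the output block's footprint plus its halo from the input.
            tile = [row[bx:bx + bw + halo] for row in img[by:by + bh + halo]]
            result = run_layers(tile, kernels)  # bh x bw after all layers
            for y in range(bh):
                out[by + y][bx:bx + bw] = result[y]
    return out


if __name__ == "__main__":
    image = [[(3 * y + 5 * x) % 7 for x in range(13)] for y in range(12)]
    layers = [
        [[1, 0, -1], [2, 0, -2], [1, 0, -1]],
        [[0, 1, 0], [1, -4, 1], [0, 1, 0]],
        [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    ]
    print(block_based(image, layers) == frame_based(image, layers))
```

In this sketch the two flows produce identical outputs. The cost is that neighbouring tiles overlap by the halo, so some pixels are read and recomputed more than once, and the halo grows with the number of layers.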
Taxonomy
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution
