Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA
Cecilia Latotzke, Tim Ciesielski, and Tobias Gemmeke

TL;DR
This paper presents a comprehensive methodology for designing high-throughput, mixed-precision CNN accelerators on FPGA, optimizing accuracy and efficiency through layer-wise quantization and resource-aware exploration.
Contribution
The paper introduces a holistic, multi-level exploration approach for FPGA-based mixed-precision CNN accelerators that supports layer-wise and channel-wise quantization and reduces the parameter memory footprint by up to 9.4x.
Findings
Achieves 245 fps with 87.48% Top-5 accuracy on ResNet-18
Reaches 92.9% Top-5 accuracy at 1.13 TOps/s (tera-operations per second) on ResNet-152
Reduces parameter memory footprint by up to 9.4x
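To make the footprint figure concrete, the following is a minimal sketch of how a reduction factor of this kind can be computed when each layer stores its weights at its own word length. The layer sizes and bit widths are hypothetical illustration values, not numbers from the paper, and the 32-bit baseline is an assumption standing in for a floating-point reference.

```python
def parameter_footprint_bits(layer_params, layer_bits):
    """Total parameter storage in bits, given per-layer counts and word lengths."""
    if len(layer_params) != len(layer_bits):
        raise ValueError("need one word length per layer")
    return sum(n * b for n, b in zip(layer_params, layer_bits))


def footprint_reduction(layer_params, layer_bits, baseline_bits=32):
    """Ratio of baseline storage to mixed-precision storage.

    baseline_bits=32 is an assumed floating-point reference word length.
    """
    baseline = parameter_footprint_bits(
        layer_params, [baseline_bits] * len(layer_params)
    )
    mixed = parameter_footprint_bits(layer_params, layer_bits)
    return baseline / mixed


# Hypothetical three-layer network: parameter counts and per-layer word lengths.
example_params = [1000, 4000, 3000]
example_bits = [8, 4, 2]
example_ratio = footprint_reduction(example_params, example_bits)
print(f"footprint reduction: {example_ratio:.2f}x")
```

Because the ratio is weighted by parameter count, assigning short word lengths to the largest layers has the strongest effect on the overall footprint.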
Abstract
Convolutional Neural Networks (CNNs) reach high accuracies in various application domains, but require large amounts of computation and incur costly data movements. One method to decrease these costs while trading accuracy is weight and/or activation word-length reduction. Thereby, layer-wise mixed-precision quantization allows for more efficient results while inflating the design space. In this work, we present an in-depth quantitative methodology to efficiently explore the design space considering the limited hardware resources of a given FPGA. Our holistic exploration approach vertically traverses the various design entry levels from the architectural down to the logic level, and laterally covers optimization from processing elements to dataflow for an efficient mixed-precision CNN accelerator. Our resulting hardware accelerators implement truly mixed-precision operations that enable…
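As an illustration of word-length reduction applied layer by layer, here is a minimal sketch using symmetric uniform quantization, a common textbook scheme. It is not claimed to be the quantizer used by the authors; the function names, layer names, weights, and bit widths below are all hypothetical.

```python
def quantize_symmetric(weights, bits):
    """Map floats to signed integer codes in [-(2**(bits-1) - 1), 2**(bits-1) - 1].

    Returns the integer codes and the scale needed to reconstruct approximate floats.
    """
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed code")
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Reconstruct approximate float weights from integer codes."""
    return [c * scale for c in codes]


# Hypothetical layer-wise assignment: each layer gets its own word length.
layers = {
    "conv1": ([0.6, -1.0, 0.2, 0.0], 4),
    "conv2": ([0.3, -0.1, 0.05, 0.2], 8),
}

quantized = {}
for name, (weights, bits) in layers.items():
    codes, scale = quantize_symmetric(weights, bits)
    quantized[name] = (codes, scale)
    print(name, bits, "bits ->", codes)
```

With a shorter word length the codes are coarser, so the reconstruction error per weight grows up to half a quantization step. This is the accuracy cost that a layer-wise word-length choice trades against storage and compute.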
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Image Processing Techniques and Applications
