BinArray: A Scalable Hardware Accelerator for Binary Approximated CNNs
Mario Fischer, Juergen Wassner (Department of Engineering and Architecture, Lucerne University of Applied Sciences and Arts, Switzerland)

TL;DR
BinArray is a scalable FPGA-based hardware accelerator for binary-approximated CNNs that reduces computational costs and allows dynamic trade-offs between accuracy and throughput, enabling efficient real-time inference on low-power devices.
Contribution
This paper introduces BinArray, a hardware accelerator for CNNs with binary-approximated weights. It scales across hardware budgets, supports dynamic accuracy-throughput trade-offs, and is optimized for FPGA implementation.
Findings
Operates at 400 MHz on an FPGA-SoC platform.
Scales to match the performance of the EdgeTPU for various networks.
Uses only 50% of device resources for MobileNet inference.
Abstract
Deep Convolutional Neural Networks (CNNs) have become the state of the art for computer vision and other signal processing tasks due to their superior accuracy. In recent years, considerable effort has gone into reducing the computational cost of CNNs in order to achieve real-time operation on low-power embedded devices. Towards this goal, we present BinArray, a custom hardware accelerator for CNNs with binary approximated weights. The binary approximation used in this paper is an improved version of a network compression technique initially suggested in [1]. It drastically reduces the number of multiplications required per inference with little or no accuracy degradation. BinArray scales easily and allows a trade-off between hardware resource usage and throughput by means of three design parameters that are transparent to the user. Furthermore, it is possible to select between high accuracy or…
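To make the claim about fewer multiplications concrete, here is a minimal sketch of one common form of binary weight approximation, in which a weight vector is written as a sum of scaled vectors with entries in {-1, +1}. This is an illustration under assumptions, not the paper's method: the greedy residual fitting, the function names `binary_approximate`, `reconstruct`, and `dot_binary`, and the parameter `num_terms` are all hypothetical, and the improved variant used in the paper may differ.

```python
def binary_approximate(weights, num_terms):
    """Greedily fit weights ~ sum_m alpha_m * b_m with b_m in {-1, +1}^n.

    Hypothetical helper for illustration; not taken from the paper.
    """
    residual = list(weights)
    alphas, bases = [], []
    for _ in range(num_terms):
        # Binary basis: the sign pattern of what is still unexplained.
        b = [1 if r >= 0 else -1 for r in residual]
        # Least-squares optimal scale for a sign basis: mean absolute residual.
        alpha = sum(abs(r) for r in residual) / len(residual)
        alphas.append(alpha)
        bases.append(b)
        residual = [r - alpha * s for r, s in zip(residual, b)]
    return alphas, bases


def reconstruct(alphas, bases):
    """Rebuild the approximated weight vector from its binary terms."""
    n = len(bases[0])
    return [sum(a * b[i] for a, b in zip(alphas, bases)) for i in range(n)]


def dot_binary(alphas, bases, x):
    """Dot product with the approximated weights.

    Each binary term needs only additions and subtractions of the inputs,
    followed by a single multiplication by its scale factor.
    """
    total = 0.0
    for alpha, b in zip(alphas, bases):
        acc = 0.0
        for s, xi in zip(b, x):
            acc = acc + xi if s > 0 else acc - xi
        total += alpha * acc  # one multiplication per binary term
    return total


if __name__ == "__main__":
    w = [0.5, -0.3, 0.8, -0.1]
    x = [1.0, 2.0, -1.0, 0.5]
    for m in (1, 2, 4):
        alphas, bases = binary_approximate(w, m)
        print(m, reconstruct(alphas, bases), dot_binary(alphas, bases, x))
```

In this sketch a dot product over n inputs costs one multiplication per binary term rather than n, and adding terms never increases the squared reconstruction error. Choosing how many terms to evaluate is one plausible way a trade-off between accuracy and throughput could arise, though the summary above does not say how BinArray implements it.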
