# Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL

**Authors:** Corey Lammie, Wei Xiang, and Mostafa Rahimi Azghadi

arXiv: 1905.06105 · 2021-02-18

## TL;DR

This paper presents FPGA-accelerated deterministic and stochastic binarized neural networks built with OpenCL. On MNIST and CIFAR-10, it reports a >16-fold improvement in power consumption over conventional GPU-accelerated networks and inference times reduced by more than 9.89x.

## Contribution

Introduces what are, to the best of the authors' knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compares them to implementations accelerated using both GPUs and FPGAs.

## Key findings

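- A >16-fold improvement in power consumption compared to conventional GPU-accelerated networks.
- Inference times reduced by >9.89x and >9.91x on MNIST and CIFAR-10, respectively, for both the deterministic and stochastic BNNs.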
- Near state-of-the-art performance achieved with binarized networks.

## Abstract

Recent technological advances have proliferated the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) are burgeoning. While GPU-accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation are limited. FPGAs are considered a suitable candidate for performance-critical, low-power systems, e.g., Internet of Things (IoT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environments, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further improved by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated using both GPUs and FPGAs. Our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art performance, while offering a >16-fold improvement in power consumption, compared to conventional GPU-accelerated networks. Both our FPGA-accelerated deterministic and stochastic BNNs reduce inference times on MNIST and CIFAR-10 by >9.89x and >9.91x, respectively.
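The abstract distinguishes deterministic from stochastic weight binarization without giving the formulas in this summary view. The sketch below illustrates the difference using the formulation commonly associated with BinaryConnect-style training: a sign function for the deterministic case, and sampling with a hard-sigmoid probability for the stochastic case. This is an assumption for illustration; the paper's own OpenCL kernels may differ, and the function names here are hypothetical.

```python
import random


def hard_sigmoid(x):
    """Clip (x + 1) / 2 to [0, 1]; used here as the probability of drawing +1."""
    return min(1.0, max(0.0, (x + 1.0) / 2.0))


def binarize_deterministic(weights):
    """Deterministic binarization: +1 for non-negative weights, -1 otherwise."""
    return [1 if w >= 0 else -1 for w in weights]


def binarize_stochastic(weights, rng):
    """Stochastic binarization: +1 with probability hard_sigmoid(w), else -1.

    Assumed formulation (BinaryConnect-style), not taken from the paper.
    """
    return [1 if rng.random() < hard_sigmoid(w) else -1 for w in weights]


# Hypothetical real-valued weights, chosen only for illustration.
weights = [-1.5, -0.4, 0.0, 0.3, 2.0]
rng = random.Random(0)  # seeded so the sketch is repeatable

det = binarize_deterministic(weights)
sto = binarize_stochastic(weights, rng)
print("deterministic:", det)
print("stochastic:   ", sto)
```

Under this assumed formulation, weights at or beyond ±1 binarize the same way in both schemes, while weights inside (-1, 1) are sampled, so the expected value of a stochastically binarized weight equals the clipped real-valued weight.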

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.06105/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1905.06105/full.md

## References

12 references — full list in the complete paper: https://tomesphere.com/paper/1905.06105/full.md

---
Source: https://tomesphere.com/paper/1905.06105