# Minimizing Classification Energy of Binarized Neural Network Inference for Wearable Devices

**Authors:** Morteza Hosseini, Hirenkumar Paneliya, Uttej Kallakuri, Mohit Khatwani, and Tinoosh Mohsenin

arXiv: 1903.11381 · 2019-03-28

## TL;DR

This paper presents a scalable low-power hardware platform for binarized neural networks tailored for wearable devices, optimizing energy efficiency through configurable processing engines and memory width, and introducing a Pool-Skipping technique to further reduce energy consumption.

## Contribution

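The paper introduces a scalable hardware architecture for BNNs whose number of processing engines (PEs) and memory width are configurable, together with a novel Pool-Skipping method. Compared with related work on the same case studies, target platforms, and classification accuracy, the hardware is reported to be 4.5x more energy efficient for Stress Detection on FPGA and 250x more energy efficient for Physical Activity Monitoring on ASIC.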

## Key findings

- A good choice of memory width and number of PEs improves energy consumption by up to 4x on an Artix-7 FPGA and up to 2.5x on a 65nm CMOS ASIC.
- Wider memories generally improve BNN processing efficiency.
- Pool-Skipping skips at least 25% of the operations that are accompanied by a Max-Pool layer, giving a 22% total operation reduction in the Stress Detection case study.
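
This summary does not spell out how Pool-Skipping works, so the following is only a sketch under an assumption: with outputs binarized to +1/-1, a Max-Pool over a window acts like a logical OR, so once one output in the window is +1 the remaining outputs in that window need not be computed. The function names and the equiprobable-output enumeration are illustrative and not taken from the paper.

```python
from itertools import product


def pool_with_skipping(window_outputs):
    """Max-pool one window of +1/-1 outputs, stopping at the first +1.

    Returns (pooled_value, number_of_outputs_evaluated).
    Assumption: evaluating an output is the costly step being skipped.
    """
    evaluated = 0
    for value in window_outputs:
        evaluated += 1
        if value == 1:
            # The maximum is already known; skip the rest of the window.
            return 1, evaluated
    return -1, evaluated


def skipped_fraction(window_size):
    """Fraction of output evaluations skipped, averaged over every
    possible +1/-1 pattern of the window (all patterns weighted equally)."""
    total = 0
    evaluated = 0
    for pattern in product((1, -1), repeat=window_size):
        _, used = pool_with_skipping(pattern)
        evaluated += used
        total += window_size
    return 1 - evaluated / total


if __name__ == "__main__":
    for size in (2, 4):
        print(size, skipped_fraction(size))
```

Under this assumed mechanism and equally likely outputs, a window of two skips a quarter of the evaluations, which is consistent with the "at least 25%" figure, and larger windows skip more. The actual savings in the paper depend on the real output statistics and on the authors' implementation.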

## Abstract

In this paper, we propose low-power hardware for efficient deployment of binarized neural networks (BNNs) that have been trained for physiological datasets. BNNs constrain weights and feature maps to 1 bit, can pack in as many 1-bit weights as the width of a memory entry provides, and can execute multiple multiply-accumulate (MAC) operations with one fused bit-wise xnor and population-count instruction over aligned packed entries. Our proposed hardware is scalable with the number of processing engines (PEs) and the memory width, both of which are adjustable for the most energy-efficient configuration given an application. We implement two real case studies, Physical Activity Monitoring and Stress Detection, on our platform, and for each case study on the target platform we seek the optimal PE and memory configurations. Our implementation results indicate that a configuration with a good choice of memory width and number of PEs can be optimized in energy consumption by up to 4x on an Artix-7 FPGA and up to 2.5x on a 65nm CMOS ASIC implementation. We also show that, generally, wider memories make more efficient BNN processing hardware. To further reduce the energy, we introduce a Pool-Skipping technique that can skip at least 25% of the operations that are accompanied by a Max-Pool layer in BNNs, leading to a total of 22% operation reduction in the Stress Detection case study. Compared to the related works using the same case studies on the same target platform and with the same classification accuracy, our hardware is 4.5x and 250x more energy efficient for Stress Detection on FPGA and Physical Activity Monitoring on ASIC, respectively.
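
The fused xnor and population-count step described in the abstract can be illustrated in software. This is a minimal sketch, not the authors' hardware design: it assumes +1 is encoded as bit 1 and -1 as bit 0, and the helper names (`pack_bits`, `xnor_popcount_dot`, `reference_dot`) are hypothetical.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into one integer word.

    Assumed encoding: bit i is 1 when values[i] == +1, else 0.
    """
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_word, w_word, width):
    """Dot product of two packed +1/-1 vectors of length `width`.

    xnor marks the positions where the two vectors agree (product +1);
    the remaining positions disagree (product -1), so
    dot = matches - (width - matches) = 2 * matches - width.
    """
    mask = (1 << width) - 1
    agree = ~(a_word ^ w_word) & mask   # bit-wise xnor, limited to `width` bits
    matches = bin(agree).count("1")     # population count
    return 2 * matches - width


def reference_dot(a, w):
    """Plain multiply-accumulate over unpacked +1/-1 values, for comparison."""
    return sum(x * y for x, y in zip(a, w))


if __name__ == "__main__":
    activations = [1, -1, 1, 1, -1, -1, 1, -1]
    weights = [1, 1, -1, 1, -1, 1, 1, -1]
    packed = xnor_popcount_dot(pack_bits(activations), pack_bits(weights), 8)
    print(packed, reference_dot(activations, weights))
```

One xnor plus one population count replaces `width` separate multiply-accumulates here, which illustrates why a wider memory entry lets a single operation cover more weights.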

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.11381/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1903.11381/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1903.11381/full.md

---
Source: https://tomesphere.com/paper/1903.11381