Efficient Error-Tolerant Quantized Neural Network Accelerators

Giulio Gambardella; Johannes Kappauf; Michaela Blott; Christoph Doehring; Martin Kumm; Peter Zipf; Kees Vissers

arXiv:1912.07394·eess.SP·December 17, 2019

TL;DR

This paper evaluates the fault tolerance of quantized neural networks in hardware accelerators, revealing their vulnerability and proposing methods to improve robustness with minimal redundancy.

Contribution

It introduces a methodology using FPGA-based error injection to assess fault impact on QNNs and proposes two novel fault mitigation techniques.
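The idea of injecting faults and observing their effect on predictions can be illustrated in software. The sketch below is not the paper's FPGA-based flow: it is a minimal, hypothetical Python model in which one bit of one quantized weight is forced to a fixed value (a stuck-at fault, assumed here as the fault model) and the fraction of changed predictions is measured. The 4-bit weight width, the layer sizes, the random data, and all function names are assumptions made for illustration.

```python
# Software-only sketch of stuck-at fault injection into a tiny quantized layer.
# All names, sizes, and the fault model are hypothetical, not taken from the paper.
import random

BITS = 4  # assumed weight width; the page does not state the precision used


def to_unsigned(w, bits=BITS):
    """Two's-complement encode a signed integer weight."""
    return w & ((1 << bits) - 1)


def to_signed(u, bits=BITS):
    """Decode a two's-complement bit pattern back to a signed integer."""
    return u - (1 << bits) if u & (1 << (bits - 1)) else u


def stuck_at(w, bit, value, bits=BITS):
    """Force one bit of a weight to 0 or 1, modelling a stuck-at fault."""
    u = to_unsigned(w, bits)
    u = (u | (1 << bit)) if value else (u & ~(1 << bit))
    return to_signed(u, bits)


def predict(weights, x):
    """Integer matrix-vector product followed by argmax over output rows."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return scores.index(max(scores))


def misprediction_rate(weights, inputs, row, col, bit, value):
    """Fraction of inputs whose predicted class changes under one fault."""
    faulty = [list(r) for r in weights]
    faulty[row][col] = stuck_at(weights[row][col], bit, value)
    changed = sum(predict(weights, x) != predict(faulty, x) for x in inputs)
    return changed / len(inputs)


rng = random.Random(0)
lo, hi = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1
weights = [[rng.randint(lo, hi) for _ in range(8)] for _ in range(4)]
inputs = [[rng.randint(0, 3) for _ in range(8)] for _ in range(200)]

# Compare a fault in the sign bit with a fault in the least significant bit.
msb_rate = misprediction_rate(weights, inputs, row=0, col=0, bit=BITS - 1, value=1)
lsb_rate = misprediction_rate(weights, inputs, row=0, col=0, bit=0, value=1)
print(f"MSB stuck-at-1: {msb_rate:.2%}, LSB stuck-at-1: {lsb_rate:.2%}")
```

Sweeping `row`, `col`, `bit`, and `value` over every weight would give a per-location sensitivity map. A hardware-accelerated campaign such as the one the paper describes would serve the same purpose at much larger scale.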

Findings

01. QNNs with convolutional layers are less fault-tolerant than previously believed.

02. Faults can cause accuracy drops of up to 10%.

03. Proposed methods improve robustness with less redundancy.

Abstract

Neural networks are currently among the most widely deployed machine learning algorithms. In particular, Convolutional Neural Networks (CNNs) are gaining popularity and are being evaluated for deployment in safety-critical applications such as self-driving vehicles. Modern CNNs have enormous memory bandwidth and computational needs, challenging existing hardware platforms to meet throughput, latency and power requirements. Functional safety and error tolerance need to be considered as an additional requirement in safety-critical systems. In general, fault-tolerant operation can be achieved by adding redundancy to the system, which further exacerbates the computational demands. Furthermore, the question arises whether pruning and quantization methods for performance scaling turn out to be counterproductive with regard to fail-safety requirements. In this work we present a…

Taxonomy

Methods: Pruning