On the Resilience of Deep Learning for Reduced-voltage FPGAs
Kamyar Givaki, Behzad Salami, Reza Hojabr, S. M. Reza Tayaranian, Ahmad Khonsari, Dara Rahmati, Saeid Gorgin, Adrian Cristal, Osman S. Unsal

TL;DR
This paper investigates the resilience of FPGA-based deep neural network training under aggressive voltage underscaling, demonstrating that modern FPGAs can tolerate low-voltage faults with minimal impact on accuracy, reducing the need for additional fault mitigation.
Contribution
It provides an experimental analysis of FPGA resilience during DNN training under voltage underscaling, highlighting the robustness and fault masking capabilities of modern FPGAs.
Findings
Modern FPGAs are robust at extremely low-voltage levels.
Low-voltage faults can be masked within training iterations, avoiding extra mitigation.
Increased fault rates degrade accuracy, with Tanh outperforming ReLU at high fault levels.
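To make the last finding concrete, the sketch below injects bit flips into weights stored in a fixed-point format and compares a single neuron's output under Tanh and ReLU. It is an illustration only, not the authors' experimental setup: the 16-bit word with 8 fractional bits, the independent per-bit fault model, and every function name here are assumptions. The intuition it shows (a bounded activation caps how far a corrupted weight can push an output) is one plausible reading of the result, not an explanation stated in the summary above.

```python
import math
import random

WIDTH = 16      # assumed word width of a stored weight
FRAC_BITS = 8   # assumed number of fractional bits


def to_fixed(x: float) -> int:
    """Quantize a float to signed fixed point, saturating at the word limits."""
    lo, hi = -(1 << (WIDTH - 1)), (1 << (WIDTH - 1)) - 1
    return max(lo, min(hi, round(x * (1 << FRAC_BITS))))


def from_fixed(q: int) -> float:
    """Convert a signed fixed-point integer back to a float."""
    return q / (1 << FRAC_BITS)


def flip_bit(q: int, bit: int) -> int:
    """Flip one bit of a two's-complement word and return the signed value."""
    raw = (q & ((1 << WIDTH) - 1)) ^ (1 << bit)
    return raw - (1 << WIDTH) if raw >= (1 << (WIDTH - 1)) else raw


def inject_faults(weights, fault_rate, rng):
    """Flip each stored bit independently with probability fault_rate.

    Hypothetical fault model: real undervolting faults need not be
    uniform or independent across bits.
    """
    faulty = []
    for w in weights:
        q = to_fixed(w)
        for bit in range(WIDTH):
            if rng.random() < fault_rate:
                q = flip_bit(q, bit)
        faulty.append(from_fixed(q))
    return faulty


def relu(x: float) -> float:
    return max(0.0, x)


def neuron(weights, inputs, activation):
    """Weighted sum followed by the given activation function."""
    return activation(sum(w * x for w, x in zip(weights, inputs)))


if __name__ == "__main__":
    # A single flip in a high-order bit turns the weight 0.5 into 64.5.
    corrupted = from_fixed(flip_bit(to_fixed(0.5), 14))
    print("corrupted weight:", corrupted)
    print("ReLU output:", neuron([corrupted], [1.0], relu))
    print("Tanh output:", neuron([corrupted], [1.0], math.tanh))

    # Random injection at an arbitrary, illustrative fault rate.
    rng = random.Random(0)
    print(inject_faults([0.5, -0.25, 0.125], 0.05, rng))
```

With ReLU the corrupted weight passes straight through to the output, whereas Tanh saturates near 1, so the error handed to the next layer stays bounded.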
Abstract
Deep Neural Networks (DNNs) are inherently computation-intensive and also power-hungry. Hardware accelerators such as Field Programmable Gate Arrays (FPGAs) are a promising solution that can satisfy these requirements for both embedded and High-Performance Computing (HPC) systems. In FPGAs, as well as CPUs and GPUs, aggressive voltage scaling below the nominal level is an effective technique for power dissipation minimization. Unfortunately, bit-flip faults start to appear as the voltage is scaled down closer to the transistor threshold due to timing issues, thus creating a resilience issue. This paper experimentally evaluates the resilience of the training phase of DNNs in the presence of voltage underscaling related faults of FPGAs, especially in on-chip memories. Toward this goal, we have experimentally evaluated the resilience of LeNet-5 and also a specially designed network for…
Taxonomy
Methods: Tanh Activation
