CRAFT: Criticality-Aware Fault-Tolerance Enhancement Techniques for Emerging Memories-Based Deep Neural Networks
Thai-Hoang Nguyen, Muhammad Imran, Jaehyuk Choi, Joon-Sung Yang

TL;DR
This paper introduces CRAFT, a set of techniques that improve the reliability of NVM-based deep neural networks by analyzing and mitigating the impact of stuck-at faults through data remapping and critical-bit encoding.
Contribution
The paper presents novel criticality-aware fault-tolerance methods, including data remapping and encoding, to enhance DNN reliability on emerging NVMs with stuck-at faults.
Findings
Data remapping reduces error impact on DNN accuracy.
Critical-bit analysis identifies key parameters affecting reliability.
Encoding swaps critical bits with non-critical bits to mitigate faults.
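The summary above does not spell out the encoding, so the following is only a toy sketch of the general idea of moving a fault from a critical bit to a non-critical one. It is not the paper's algorithm. It assumes an unsigned 8-bit weight whose most significant bit (MSB) is treated as critical and whose least significant bit (LSB) is treated as non-critical. The helper names (`read_back`, `swap_bits`, `store_and_load`) are hypothetical, and where a per-word swap flag would be stored is left out.

```python
def read_back(word, stuck_mask, stuck_values, width=8):
    """Value read from memory after writing `word` to cells with stuck-at faults.

    Bit i of `stuck_mask` is 1 if cell i is stuck; for such a cell,
    bit i of `stuck_values` is the value it is stuck at.
    """
    full = (1 << width) - 1
    return (word & ~stuck_mask & full) | (stuck_values & stuck_mask)


def swap_bits(word, i, j):
    """Exchange bit i and bit j of `word`."""
    if ((word >> i) & 1) != ((word >> j) & 1):
        word ^= (1 << i) | (1 << j)
    return word


def store_and_load(word, stuck_mask, stuck_values, swap=False, width=8):
    """Write `word`, read it back, optionally with the MSB and LSB exchanged.

    Assumption: the MSB is the critical bit and the LSB is non-critical.
    """
    msb, lsb = width - 1, 0
    stored = swap_bits(word, msb, lsb) if swap else word
    raw = read_back(stored, stuck_mask, stuck_values, width)
    return swap_bits(raw, msb, lsb) if swap else raw


# MSB cell stuck at 1; the weight to store is 64 (0b01000000).
plain = store_and_load(64, 0x80, 0x80)               # fault corrupts the MSB
swapped = store_and_load(64, 0x80, 0x80, swap=True)  # fault lands on the LSB
print(abs(plain - 64), abs(swapped - 64))            # prints "128 1"
```

In this toy case the same physical fault changes the stored weight by 128 without the swap and by 1 with it, which is the intuition behind protecting critical bits first.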
Abstract
Deep Neural Networks (DNNs) have emerged as the most effective programming paradigm for computer vision and natural language processing applications. With the rapid development of DNNs, efficient hardware architectures for deploying DNN-based applications on edge devices have been extensively studied. Emerging Non-Volatile Memories (NVMs), with their better scalability, non-volatility, and good read performance, are promising candidates for deploying DNNs. Despite this promise, emerging NVMs often suffer from reliability issues such as stuck-at faults, which decrease chip yield and memory lifetime and severely impact the accuracy of DNNs. A stuck-at cell can be read but not reprogrammed; stuck-at faults in NVMs therefore may or may not result in errors, depending on the data to be stored. By reducing the number of errors caused by stuck-at faults, the reliability of a…
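The observation that a stuck-at fault produces an error only when the data disagrees with the stuck value can be illustrated with a minimal sketch. The bit-mask fault model and the function names (`read_with_stuck_at`, `count_bit_errors`) are assumptions made for illustration and are not taken from the paper.

```python
def read_with_stuck_at(word, stuck_mask, stuck_values, width=8):
    """Value read back after writing `word` to a row of cells.

    Bit i of `stuck_mask` is 1 if cell i is stuck; bit i of
    `stuck_values` is the value that stuck cell always returns.
    """
    full = (1 << width) - 1
    healthy = word & ~stuck_mask & full   # writable cells keep the data
    faulty = stuck_values & stuck_mask    # stuck cells keep their old value
    return healthy | faulty


def count_bit_errors(word, stuck_mask, stuck_values, width=8):
    """Number of bits that differ between the written and the read value."""
    read = read_with_stuck_at(word, stuck_mask, stuck_values, width)
    return bin(word ^ read).count("1")


# Two faulty cells: bit 7 stuck at 1 and bit 0 stuck at 0.
MASK, VALUES = 0b10000001, 0b10000000

print(count_bit_errors(0b10000000, MASK, VALUES))  # prints 0: data matches both stuck cells
print(count_bit_errors(0b00000001, MASK, VALUES))  # prints 2: data conflicts with both
```

The same pair of faulty cells is harmless for one data word and causes two bit errors for another, which is why the choice of what data lands on which cells matters.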
