Online Soft Error Tolerance in ReRAM Crossbars for Deep Learning Accelerators
Benyamin Khezeli, Hamid Reza Zarandi, Elham Cheshmikhani

TL;DR
This paper introduces an online soft error detection and correction method for ReRAM crossbar arrays used in deep learning accelerators, improving accuracy with low overhead despite fabrication and runtime errors.
Contribution
The paper proposes an online error detection and correction technique for ReRAM-based PIM accelerators that combines a test input vector with Error Correcting Codes (ECCs).
Findings
The method achieves near fault-free accuracy for neural networks on the MNIST and CIFAR-10 datasets.
It has low area overhead and power consumption compared to recent methods.
It detects and corrects faulty columns in ReRAM crossbars.
Abstract
Resistive Random-Access Memory (ReRAM) crossbar arrays are promising candidates for in-situ matrix-vector multiplication (MVM), a frequent operation in Deep Learning algorithms. Despite their advantages, these emerging non-volatile memories are susceptible to errors due to non-idealities such as immature fabrication processes and runtime errors, which lead to accuracy degradation in Processing-in-Memory (PIM) accelerators. This paper proposes an online soft error detection and correction method in ReRAM crossbar arrays. We utilize a test input vector and Error Correcting Codes (ECCs) to detect and correct faulty columns. The proposed approach demonstrates near fault-free accuracy for Neural Networks (NNs) on MNIST and CIFAR-10 datasets, with low area overhead and power consumption compared to recent methods.
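The abstract does not spell out the exact code or test procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It models a crossbar as an integer weight matrix, uses an all-ones test vector to flag columns whose output deviates from a stored reference, and appends a single row-sum checksum column (a simple stand-in for an ECC, not necessarily the code the authors use) to rebuild one faulty column's output. All function names here are hypothetical.

```python
def mvm(x, w):
    """Ideal crossbar MVM: output j is sum over i of x[i] * w[i][j]."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def add_checksum_column(w):
    """Append one column holding each row's sum (assumed stand-in for an ECC)."""
    return [row + [sum(row)] for row in w]

def detect_faulty_columns(w_stored, test_vector, expected):
    """Apply the test vector and flag columns whose output differs from the reference.

    Exact comparison is used because this sketch is integer-valued; a real
    analog array would need a tolerance.
    """
    observed = mvm(test_vector, w_stored)
    return [j for j, (o, e) in enumerate(zip(observed, expected)) if o != e]

def corrected_mvm(x, w_stored, faulty):
    """Run the MVM and rebuild a single faulty data column from the checksum output."""
    y = mvm(x, w_stored)
    data, check = y[:-1], y[-1]
    if len(faulty) == 1 and faulty[0] < len(data):
        j = faulty[0]
        data[j] = check - sum(v for k, v in enumerate(data) if k != j)
    return data

# Fault-free weights and their checksum-extended form.
golden = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
coded = add_checksum_column(golden)

# Reference response to the test vector, recorded while the array is known good.
test_vector = [1, 1, 1]
expected = mvm(test_vector, coded)

# Inject a soft error: one cell in data column 1 loses its value.
stored = [row[:] for row in coded]
stored[1][1] = 0

faulty = detect_faulty_columns(stored, test_vector, expected)
x = [1, 2, 3]
recovered = corrected_mvm(x, stored, faulty)
```

In this toy setup the corrupted column is flagged by the test vector, and its output for `x` is recovered as the checksum output minus the sum of the healthy columns' outputs. A single checksum column can only repair one faulty data column at a time, and an all-ones test vector misses faults that leave a column's sum unchanged; stronger codes and test patterns would be needed to cover those cases.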
Taxonomy
Topics: Radiation Effects in Electronics · Advanced Memory and Neural Computing · Advanced Neural Network Applications
