Mitigating multiple single-event upsets during deep neural network inference using fault-aware training
Toon Vinck, Naïn Jonckers, Gert Dekkers, Jeffrey Prinzie, Peter Karsmakers

TL;DR
This paper introduces a fault-aware training method that makes deep neural networks more robust to multiple single-event upsets, which matters for safety-critical applications in high-radiation environments.
Contribution
It analyses the impact of multiple single-bit upsets through model-level fault injection and proposes a fault-aware training (FAT) approach that improves DNN fault tolerance by up to a factor of 3 without hardware changes.
Findings
Fault-aware training (FAT) improves fault tolerance by up to a factor of 3
Fault injection at the model level shows the impact of multiple single-bit upsets
The method applies to safety-critical DNN deployments and requires no hardware changes
Abstract
Deep neural networks (DNNs) are increasingly used in safety-critical applications. Reliable fault analysis and mitigation are essential to ensure their functionality in harsh environments that contain high radiation levels. This study analyses the impact of multiple single-bit single-event upsets in DNNs by performing fault injection at the level of a DNN model. Additionally, a fault-aware training (FAT) methodology is proposed that improves the DNNs' robustness to faults without any modification to the hardware. Experimental results show that the FAT methodology improves the tolerance to faults by up to a factor of 3.
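The abstract describes fault injection at the model level but does not give the procedure. As a rough illustration only, the sketch below flips a chosen number of randomly selected single bits in a float32 weight array using NumPy. The function name `inject_bit_flips`, the float32 representation, and the uniform choice of bit positions are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def inject_bit_flips(weights, n_flips, rng):
    """Return a copy of `weights` with `n_flips` distinct single bits flipped.

    Hypothetical helper: assumes weights are stored as IEEE 754 float32 and
    that every bit of every weight is equally likely to be upset.
    """
    # Work on a flat float32 copy so the caller's array is left untouched.
    flat = np.ascontiguousarray(weights, dtype=np.float32).ravel().copy()
    # Reinterpret the same memory as raw 32-bit words to manipulate bits.
    bits = flat.view(np.uint32)
    total_bits = bits.size * 32
    # Draw distinct global bit positions, one per simulated upset.
    positions = rng.choice(total_bits, size=n_flips, replace=False)
    for pos in positions:
        word, bit = divmod(int(pos), 32)
        bits[word] ^= np.uint32(1) << np.uint32(bit)
    return flat.reshape(np.shape(weights))


rng = np.random.default_rng(0)
clean = rng.standard_normal((4, 8)).astype(np.float32)
faulty = inject_bit_flips(clean, n_flips=5, rng=rng)
```

A flip in an exponent bit can change a weight by many orders of magnitude, while a flip in a low mantissa bit is nearly invisible, so results from such a sketch depend heavily on the assumed number format. One plausible reading of fault-aware training is that a perturbation of this kind is applied to the weights during training, but the abstract does not specify this.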
Taxonomy
Topics: Radiation Effects in Electronics · Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis
Methods: Attentive Walk-Aggregating Graph Neural Network
