Mitigating multiple single-event upsets during deep neural network inference using fault-aware training

Toon Vinck; Naïn Jonckers; Gert Dekkers; Jeffrey Prinzie; Peter Karsmakers

arXiv:2502.09374·cs.LG·February 14, 2025

Open Access

TL;DR

This paper introduces a fault-aware training method that makes deep neural networks more robust to multiple single-event upsets, which matters for safety-critical applications in high-radiation environments.

Contribution

It proposes a novel fault-aware training approach that significantly improves DNN fault tolerance without hardware changes.

Findings

01 FAT improves fault tolerance by up to a factor of 3

02 Fault injection analysis highlights the impact of multiple upsets

03 The method is applicable to safety-critical DNN deployments

Abstract

Deep neural networks (DNNs) are increasingly used in safety-critical applications. Reliable fault analysis and mitigation are essential to ensure their functionality in harsh environments with high radiation levels. This study analyses the impact of multiple single-bit single-event upsets in DNNs by performing fault injection at the level of the DNN model. Additionally, a fault-aware training (FAT) methodology is proposed that improves the DNNs' robustness to faults without any modification to the hardware. Experimental results show that the FAT methodology improves the tolerance to faults by up to a factor of 3.
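To make the idea of model-level fault injection concrete, the sketch below flips a chosen number of randomly selected bits in a float32 weight array, which is one common way to emulate multiple single-bit upsets in software. It is an illustration only: the function name `inject_bit_flips`, the float32 weight format, and the uniform choice of bit positions are assumptions, not details taken from the paper.

```python
import numpy as np


def inject_bit_flips(weights, n_flips, rng):
    """Return a float32 copy of `weights` with `n_flips` distinct bits flipped.

    Hypothetical helper: the fault model (uniformly random bit positions
    in float32 weights) is an assumption made for this sketch.
    """
    # Work on a flat, contiguous copy so the caller's array is untouched.
    flat = np.ascontiguousarray(weights, dtype=np.float32).ravel().copy()
    # Reinterpret the same memory as raw 32-bit words (no data copy).
    bits = flat.view(np.uint32)
    total_bits = bits.size * 32
    # Sample distinct positions so one flip never cancels another.
    positions = rng.choice(total_bits, size=n_flips, replace=False)
    for pos in positions:
        word, bit = divmod(int(pos), 32)
        bits[word] ^= np.uint32(1) << np.uint32(bit)
    return flat.reshape(np.shape(weights))


rng = np.random.default_rng(0)
clean = np.ones((4, 4), dtype=np.float32)
faulty = inject_bit_flips(clean, n_flips=3, rng=rng)
```

In a fault-aware training setup, a routine like this could plausibly be applied to the weights during some training forward passes so the network learns to tolerate corrupted values; whether the paper does exactly this is not stated in the abstract.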

Taxonomy

Topics: Radiation Effects in Electronics · Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis

Methods: Attentive Walk-Aggregating Graph Neural Network