Artificial neural networks for online error detection
Vassilis Vassiliadis, Konstantinos Parasyris, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas

TL;DR
This paper introduces a low-cost, online error detection method using artificial neural networks trained with fault injection, effectively identifying silent data corruptions with minimal performance overhead and high fault detection accuracy.
Contribution
The paper demonstrates that ANNs are effective for online detection of hardware-induced errors, outperforming existing methods such as Topaz in detection coverage and efficiency.
Findings
ANNs detect 94.85% of faults at an average overhead of 6.45% in CPU cycles, including error correction by re-execution
ANN-based detection outperforms Topaz in accuracy and performance
Validation runs on overclocked ARM Cortex A53 CPUs indicate practical applicability
Abstract
Hardware reliability is adversely affected by the downscaling of semiconductor devices and the scale-out of systems necessitated by modern applications. Apart from crashes, this unreliability often manifests as silent data corruptions (SDCs), affecting application output. Therefore, we need low-cost and low-human-effort solutions to reduce the incidence rate and the effects of SDCs on the quality of application outputs. We propose Artificial Neural Networks (ANNs) as an effective mechanism for online error detection. We train ANNs using software fault injection. We find that the average overhead of our approach, followed by a costly error correction by re-execution, is 6.45% in terms of CPU cycles. We also report that ANNs discover 94.85% of faults thereby resulting in minimal output quality degradation. To validate our approach we overclock ARM Cortex A53 CPUs, execute benchmarks on…
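The workflow described in the abstract (inject faults in software to obtain labeled samples, then re-execute whenever a detector flags an output) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `kernel`, `flip_bit`, `make_training_set`, and `run_with_detection` are hypothetical names, the single-bit-flip fault model is an assumption, and the ANN itself is not shown. Any callable that maps `(inputs, output)` to a "suspicious" flag can stand in for the trained network.

```python
import random
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit (0..63) of its IEEE-754 double encoding flipped."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped


def kernel(xs):
    """Hypothetical protected computation: a plain sum of the inputs."""
    return sum(xs)


def make_training_set(n, rng):
    """Build (inputs, output, label) samples; label 1 means a fault was injected.

    Assumption: a fault is modeled as a single bit flip in the kernel output.
    """
    samples = []
    for _ in range(n):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        y = kernel(xs)
        if rng.random() < 0.5:
            samples.append((xs, flip_bit(y, rng.randrange(64)), 1))
        else:
            samples.append((xs, y, 0))
    return samples


def run_with_detection(task, xs, detector, max_retries=3):
    """Run `task`; re-execute while `detector` flags the result as suspicious.

    `detector(xs, y)` is a stand-in for the trained ANN and returns True
    when the output looks corrupted.
    """
    y = task(xs)
    retries = 0
    while detector(xs, y) and retries < max_retries:
        y = task(xs)
        retries += 1
    return y, retries
```

In this sketch the labeled samples from `make_training_set` would feed whatever classifier is chosen as the detector, and the number of retries returned by `run_with_detection` gives a rough handle on how much re-execution the detector triggers.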
Taxonomy
Topics: Radiation Effects in Electronics · Semiconductor materials and devices · Advancements in Semiconductor Devices and Circuit Design
