Using Convolutional Neural Networks for fault analysis and alleviation in accelerator systems
Jashanpreet Singh Sraw, Deepak M C

TL;DR
This paper explores using convolutional neural networks to analyze and mitigate hardware failures in accelerator systems, aiming to improve reliability with minimal additional hardware, especially for critical applications.
Contribution
It introduces a novel method leveraging CNNs for fault analysis and prevention in accelerators, addressing hardware failure challenges with minimal overhead.
Findings
Identified systemic causes of hardware failures in accelerators.
Proposed an efficient CNN-based method to prevent failures.
Demonstrated improved system reliability with minimal hardware impact.
Abstract
Today, neural networks are the basis of breakthroughs in virtually every technical domain. Their application to accelerators has recently improved the performance and efficiency of these systems. At the same time, the increasing number of hardware failures caused by the latest (shrunk) semiconductor technology needs to be addressed. Since accelerator systems often back time-critical applications such as self-driving cars or medical diagnosis, these hardware failures must be eliminated. Our research evaluates these failures from a systemic point of view. Based on our results, we report findings that are critical for enhancing system reliability, and we propose an efficient method to avoid these failures with minimal hardware overhead.
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Advanced Neural Network Applications · Fault Detection and Control Systems
