PhD Thesis Summary: Methods for Reliability Assessment and Enhancement of Deep Neural Network Hardware Accelerators
Mahdi Taheri

TL;DR
This thesis introduces innovative, cost-effective methods for assessing and improving the reliability of deep neural network hardware accelerators, balancing fault tolerance with computational efficiency.
Contribution
It presents new analytical reliability assessment tools, explores the trade-offs between reliability, quantization, and approximation, and develops AdAM, a real-time, zero-overhead reliability enhancement technique.
Findings
AdAM achieves fault tolerance comparable to traditional redundancy methods at significantly lower hardware cost
New analytical tools improve reliability assessment accuracy
Considering reliability together with quantization and approximation allows the trade-off between computational efficiency and fault tolerance to be optimized
Abstract
This manuscript summarizes the work and showcases the impact of the doctoral thesis by introducing novel, cost-efficient methods for assessing and enhancing the reliability of DNN hardware accelerators. A comprehensive Systematic Literature Review (SLR) was conducted, categorizing existing reliability assessment techniques, identifying research gaps, and leading to the development of new analytical reliability assessment tools. Additionally, this work explores the interplay between reliability, quantization, and approximation, proposing methodologies that optimize the trade-offs between computational efficiency and fault tolerance. Furthermore, a real-time, zero-overhead reliability enhancement technique, AdAM, was developed, providing fault tolerance comparable to traditional redundancy methods while significantly reducing hardware costs. The impact of this research extends beyond…
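To make the link between quantization and fault tolerance concrete, the following minimal sketch models a single bit-flip fault in one quantized weight stored as a two's-complement integer. It is an illustration under stated assumptions, not the thesis's assessment tools or the AdAM technique: the function names `flip_bit` and `worst_case_error` are hypothetical, and the single-bit-flip fault model is a common simplification chosen here for clarity.

```python
def flip_bit(value: int, bit: int, width: int) -> int:
    """Flip one bit of a signed two's-complement integer of `width` bits.

    Hypothetical helper: models a single bit-flip fault in a stored
    quantized weight. `bit` 0 is the least significant bit.
    """
    if not 0 <= bit < width:
        raise ValueError("bit index out of range for the given width")
    mask = (1 << width) - 1
    raw = (value & mask) ^ (1 << bit)  # unsigned bit pattern with one bit flipped
    # Reinterpret the unsigned pattern as a signed two's-complement value.
    return raw - (1 << width) if raw >= (1 << (width - 1)) else raw


def worst_case_error(value: int, width: int) -> int:
    """Largest absolute change any single bit flip can cause to `value`."""
    return max(abs(flip_bit(value, b, width) - value) for b in range(width))


if __name__ == "__main__":
    weight = 5
    for width in (8, 4):
        faulty = [flip_bit(weight, b, width) for b in range(width)]
        print(f"{width}-bit weight {weight}: faulty values {faulty}, "
              f"worst-case error {worst_case_error(weight, width)}")
```

In integer units, a flip in bit `b` always changes the stored value by `2**b`, so the most significant bit dominates the worst case. A narrower word has fewer bits that can be hit, but how the perturbation translates into accuracy loss also depends on the quantization scale and the network itself, which this sketch does not model.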
Taxonomy
Topics: Radiation Effects in Electronics · Software Reliability and Analysis Research · Software-Defined Networks and 5G
