Yield Loss Reduction and Test of AI and Deep Learning Accelerators
Mehdi Sadi, Ujjwal Guin

TL;DR
This paper presents a methodology for reducing yield loss in AI and deep learning accelerators. It exploits the inherent fault tolerance of trained deep learning models and selectively deactivates faulty processing elements (PEs), so that chips with fault rates of up to 5% remain usable with less than 1% accuracy loss.
Contribution
It introduces an application-driven binning and testing approach that correlates circuit faults in the PEs with the accuracy of the target AI workload, improving yield while keeping accuracy within the workload's target.
Findings
Accelerators can tolerate a fault rate of up to 5% with less than 1% accuracy loss.
The methodology enables effective product-binning based on fault tolerance.
A new fault isolation and test flow for PEs enhances yield management.
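The binning idea in the findings above can be illustrated with a minimal Python sketch. It is a simplification, not the paper's method: the paper bins chips by correlating fault location and fault rate with workload accuracy, whereas this sketch uses the fault rate alone. The bin names, the 1% intermediate threshold, and the functions `fault_rate` and `assign_bin` are hypothetical; only the 5% ceiling comes from the reported finding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    name: str
    max_fault_rate: float  # inclusive upper bound on the fraction of faulty PEs


# Hypothetical bins, ordered from strictest to most permissive.
# Only the 5% ceiling is taken from the reported finding.
BINS = (
    Bin("fault-free", 0.0),
    Bin("low-fault", 0.01),
    Bin("fault-tolerant", 0.05),
)


def fault_rate(faulty_pes: int, total_pes: int) -> float:
    """Fraction of processing elements that failed test."""
    if total_pes <= 0:
        raise ValueError("total_pes must be positive")
    if not 0 <= faulty_pes <= total_pes:
        raise ValueError("faulty_pes must be between 0 and total_pes")
    return faulty_pes / total_pes


def assign_bin(faulty_pes: int, total_pes: int) -> str:
    """Return the first (strictest) bin whose threshold the chip meets."""
    rate = fault_rate(faulty_pes, total_pes)
    for b in BINS:
        if rate <= b.max_fault_rate:
            return b.name
    return "reject"  # fault rate above every bin threshold


if __name__ == "__main__":
    total = 256 * 256  # assumed array size, for illustration only
    for faulty in (0, 100, 2000, 4000):
        print(faulty, assign_bin(faulty, total))
```

A production flow would replace the fixed thresholds with limits derived per workload, since the same fault rate can affect accuracy differently depending on where the faulty PEs sit.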
Abstract
With data-driven analytics becoming mainstream, the global demand for dedicated AI and Deep Learning accelerator chips is soaring. These accelerators, designed with densely packed Processing Elements (PEs), are especially vulnerable to the manufacturing defects and functional faults common in advanced semiconductor process nodes, resulting in significant yield loss. In this work, we demonstrate an application-driven methodology for binning AI accelerator chips and reducing yield loss by correlating the circuit faults in the PEs of the accelerator with the desired accuracy of the target AI workload. We exploit the inherent fault tolerance features of trained deep learning models and a strategy of selective deactivation of faulty PEs to develop the presented yield loss reduction and test methodology. An analytical relationship is derived between fault location, fault rate, and the…
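To make the idea of selectively deactivating faulty PEs concrete, here is a toy Python sketch under stated assumptions. It models a 16x16 array in which PE (i, j) multiplies input `x[i]` by weight `w[i][j]` and the products are summed per column. An unmasked faulty PE is assumed to emit a fixed corrupted value (`stuck_value`), while a deactivated PE is bypassed and contributes zero. The array size, fault model, and all names are illustrative assumptions, not details from the paper, and the sketch measures numerical output error rather than the model accuracy the paper evaluates.

```python
import random


def matvec(x, w, faulty=frozenset(), bypass=False, stuck_value=100.0):
    """Column sums of x[i] * w[i][j], with one multiply per PE (i, j)."""
    rows, cols = len(w), len(w[0])
    out = [0.0] * cols
    for i in range(rows):
        for j in range(cols):
            if (i, j) in faulty:
                # A bypassed PE contributes nothing; an unmasked faulty PE
                # emits a corrupted product (modeled as a fixed stuck value).
                product = 0.0 if bypass else stuck_value
            else:
                product = x[i] * w[i][j]
            out[j] += product
    return out


def max_abs_error(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(p - q) for p, q in zip(a, b))


rng = random.Random(0)  # fixed seed so the run is repeatable
ROWS = COLS = 16
w = [[rng.uniform(-1, 1) for _ in range(COLS)] for _ in range(ROWS)]
x = [rng.uniform(0, 1) for _ in range(ROWS)]

all_pes = [(i, j) for i in range(ROWS) for j in range(COLS)]
n_faulty = max(1, round(0.05 * len(all_pes)))  # about 5% of the 256 PEs
faulty = frozenset(rng.sample(all_pes, n_faulty))

ideal = matvec(x, w)
err_unmasked = max_abs_error(matvec(x, w, faulty), ideal)
err_bypassed = max_abs_error(matvec(x, w, faulty, bypass=True), ideal)
print(f"unmasked: {err_unmasked:.2f}, bypassed: {err_bypassed:.2f}")
```

In this model a bypassed PE only drops one bounded product from its column sum, while an unmasked fault injects an arbitrary value, which is the intuition behind deactivating faulty PEs rather than leaving them active.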
Taxonomy
Topics: Advanced Memory and Neural Computing · Industrial Vision Systems and Defect Detection · Radiation Effects in Electronics
