CLEAR: Cross-Layer Exploration for Architecting Resilience - Combining Hardware and Software Techniques to Tolerate Soft Errors in Processor Cores
Eric Cheng, Shahrzad Mirkhani, Lukasz G. Szafaryn, Chen-Yong Cher, Hyungmin Cho, Kevin Skadron, Mircea R. Stan, Klas Lilja, Jacob A. Abraham, Pradip Bose, Subhasish Mitra

TL;DR
This paper introduces a comprehensive framework that systematically explores and optimizes cross-layer resilience techniques to enhance soft error tolerance in processor cores, balancing cost and effectiveness.
Contribution
The paper presents a framework that automatically explores cross-layer resilience techniques and derives cost-effective solutions for soft error tolerance in processor designs.
Findings
Combines circuit, logic, architecture, software, and algorithm techniques for resilience.
Achieves a 50x improvement in silent data corruption (SDC) rate with minimal energy overhead.
Demonstrates effectiveness on both in-order and out-of-order processor cores.
Abstract
We present a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience targets at minimal costs (energy, power, execution time, area) by combining resilience techniques across various layers of the system stack (circuit, logic, architecture, software, algorithm). This is also referred to as cross-layer resilience. In this paper, we focus on radiation-induced soft errors in processor cores. We address both single-event upsets (SEUs) and single-event multiple upsets (SEMUs) in terrestrial environments. Our framework automatically and systematically explores the large space of comprehensive resilience techniques and their combinations across various layers of the system stack (586 cross-layer combinations in this paper), derives cost-effective solutions that achieve resilience…
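The exploration the abstract describes (enumerate combinations of techniques across layers, then keep the cheapest one that meets a resilience target) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the technique names, the cost and improvement numbers, and the model in which energy costs add and SDC improvements multiply are all hypothetical placeholders, not values or methods taken from the paper.

```python
from itertools import combinations
from typing import NamedTuple, Optional, Sequence, Tuple


class Technique(NamedTuple):
    name: str
    layer: str
    energy_cost: float      # fractional energy overhead (hypothetical)
    sdc_improvement: float  # multiplicative SDC improvement (hypothetical)


# Placeholder techniques, one per layer named in the abstract.
# The numbers are invented for this sketch and carry no empirical meaning.
TECHNIQUES = [
    Technique("circuit_option", "circuit", 0.020, 20.0),
    Technique("logic_option", "logic", 0.015, 8.0),
    Technique("architecture_option", "architecture", 0.010, 1.5),
    Technique("software_option", "software", 0.150, 1.5),
    Technique("algorithm_option", "algorithm", 0.014, 4.0),
]


def evaluate(combo: Sequence[Technique]) -> Tuple[float, float]:
    """Return (energy cost, SDC improvement) for a combination.

    Simplifying assumption: costs add and improvements multiply.
    Real techniques interact, so a real flow would measure each combination.
    """
    cost = sum(t.energy_cost for t in combo)
    improvement = 1.0
    for t in combo:
        improvement *= t.sdc_improvement
    return cost, improvement


def cheapest_meeting_target(
    techniques: Sequence[Technique], target: float
) -> Optional[Tuple[float, float, Tuple[Technique, ...]]]:
    """Exhaustively search all non-empty combinations and return the
    lowest-cost one whose improvement reaches the target, or None."""
    best = None
    for size in range(1, len(techniques) + 1):
        for combo in combinations(techniques, size):
            cost, improvement = evaluate(combo)
            if improvement >= target and (best is None or cost < best[0]):
                best = (cost, improvement, combo)
    return best


if __name__ == "__main__":
    result = cheapest_meeting_target(TECHNIQUES, target=50.0)
    if result is None:
        print("no combination meets the target")
    else:
        cost, improvement, combo = result
        print([t.name for t in combo], round(cost, 3), improvement)
```

Exhaustive enumeration is workable here only because the toy space is tiny; the additive and multiplicative model is a stand-in for whatever per-combination evaluation an actual framework performs.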
