Who Checks the Checker? Enhancing Component-level Architectural SEU Fault Tolerance for End-to-End SoC Protection
Michael Rogenmoser, Philippe Sauter, Chen Wu, Angelo Garofalo, Luca Benini

TL;DR
This paper proposes a combined fault-tolerance approach for SoC components, including interconnect and voting logic, to enhance SEU resilience with minimal overhead, demonstrated on a RISC-V microcontroller.
Contribution
It introduces an overlap-based method to protect both components and interconnections, improving end-to-end SEU fault tolerance with reduced area overhead.
Findings
Achieves over 99.9% fault tolerance in simulation-based fault injection on both the RTL and the netlist.
Demonstrates 22% lower area overhead than a single global protection method.
Validates approach through simulation-based fault injection and physical implementation.
Abstract
Single-event upset (SEU) fault tolerance for systems-on-chip (SoCs) in radiation-heavy environments is often addressed by architectural fault-tolerance approaches protecting individual SoC components (e.g., cores, memories) in isolation. However, the protection of voting logic and interconnections among components is also critical, as these become single points of failure in the design. We investigate combining multiple fault-tolerance approaches targeting individual SoC components, including interconnect and voting logic, to ensure end-to-end SoC-level architectural SEU fault tolerance while minimizing implementation area overheads. Enforcing an overlap between the protection methods ensures hardening of the whole design without gaps, while curtailing overheads. We demonstrate our approach on a RISC-V microcontroller SoC. SEU fault-tolerance is assessed with simulation-based fault…
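To illustrate why voting logic can itself become a single point of failure, the following is a minimal sketch of generic triple modular redundancy (TMR) with bitwise majority voting. It is not the paper's protection scheme; the function names (`majority`, `flip_bit`), the 8-bit test word, and the bit positions are assumptions chosen purely for illustration.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant words."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model an SEU as a single bit flip (hypothetical fault model)."""
    return word ^ (1 << bit)


golden = 0b10110010  # fault-free value, arbitrary for this sketch

# Case 1: an SEU hits one of three replicas. The voter masks it.
masked = majority(golden, flip_bit(golden, 3), golden)

# Case 2: an SEU hits the output of a single, unprotected voter.
# Nothing downstream can correct it, so the fault propagates.
unprotected = flip_bit(majority(golden, golden, golden), 5)

# Case 3: the voter is triplicated as well, and the next stage votes
# again on the three voter outputs. A fault in one voter is masked.
voter_outputs = [majority(golden, golden, golden) for _ in range(3)]
voter_outputs[1] = flip_bit(voter_outputs[1], 5)
protected = majority(*voter_outputs)

print(masked == golden, unprotected == golden, protected == golden)
```

In this sketch the redundant domains overlap at the voter boundary: each stage's voters are checked by the next stage's vote, so no unreplicated logic sits between protected regions.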
