Checkpoint Placement for Systematic Fault-Injection Campaigns
Christian Dietrich, Tim-Marek Thomas, Matthias Mnich

TL;DR
This paper shows that checkpoint placement in systematic fault-injection campaigns strongly affects the number of forwarding cycles required. Using formal models and algorithms to choose the placement, it reduces forwarding cycles by up to 99.934% with 16 checkpoints.
Contribution
The paper formalizes checkpoint placement as a maximum-weight reward path problem, proposes integer linear programming (ILP) and dynamic programming solutions, and introduces a genetic-algorithm heuristic to improve fault-injection efficiency.
Findings
Reduced forwarding cycles by up to 99.934% with 16 checkpoints
Formalized checkpoint placement as a graph optimization problem
Developed ILP, dynamic programming, and heuristic methods
Abstract
Shrinking hardware structures and decreasing operating voltages lead to an increasing number of transient hardware faults, which thus become a core problem to consider for safety-critical systems. Here, systematic fault injection (FI), where one program-under-test is systematically stressed with faults, provides an in-depth resilience analysis in the presence of faults. However, FI campaigns require many independent injection experiments and, combined, long run times, especially if we aim for a high coverage of the fault space. One cost factor is the forwarding phase, which is the time required to bring the system-under-test into the fault-free state at injection time. One common technique to speed up forwarding is to take checkpoints of the fault-free system state at fixed points in time. In this paper, we show that the placement of checkpoints has a significant influence on the required…
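The effect of checkpoint placement on forwarding cost can be illustrated with a small sketch. It assumes a deliberately simplified model that is not taken from the paper: time is a single cycle counter, every experiment restores the latest checkpoint at or before its injection cycle (cycle 0, the program start, is always available) and then executes the remaining cycles, and restoring a checkpoint costs nothing. The function names `forwarding_cycles` and `place_checkpoints` are hypothetical, and the dynamic program below is a plain interval-partitioning recurrence for this toy model, not the authors' maximum-weight reward path formulation, ILP, or genetic algorithm.

```python
from bisect import bisect_right
from collections import Counter
from functools import lru_cache


def forwarding_cycles(injection_times, checkpoints):
    """Total forwarding cycles over all experiments (simplified model).

    Each experiment restores the latest checkpoint at or before its
    injection cycle and executes the remaining cycles. Cycle 0 is
    always available as a restore point.
    """
    starts = sorted(set(checkpoints) | {0})
    total = 0
    for t in injection_times:
        restore = starts[bisect_right(starts, t) - 1]
        total += t - restore
    return total


def place_checkpoints(injection_times, k):
    """Choose at most k checkpoint cycles that minimize forwarding_cycles.

    Assumption of this sketch: candidates are restricted to injection
    cycles, since moving a checkpoint forward to the next injection
    cycle never increases the cost in this model.
    Returns (minimal_cost, chosen_checkpoint_cycles).
    """
    weight = Counter(injection_times)          # experiments per cycle
    pts = [0] + sorted(set(injection_times))   # pts[0] is the program start

    def segment_cost(p, q):
        # Cost of all injections at pts[p:q] when forwarding from pts[p].
        return sum(weight[t] * (t - pts[p]) for t in pts[p:q])

    @lru_cache(maxsize=None)
    def best(p, left):
        # Option 1: place no further checkpoint after pts[p].
        result = (segment_cost(p, len(pts)), ())
        # Option 2: place the next checkpoint at pts[q].
        if left > 0:
            for q in range(p + 1, len(pts)):
                cost, chosen = best(q, left - 1)
                candidate = (segment_cost(p, q) + cost, (pts[q],) + chosen)
                if candidate[0] < result[0]:
                    result = candidate
        return result

    return best(0, k)


if __name__ == "__main__":
    times = [10, 20, 30, 100, 110]  # made-up injection cycles
    print(forwarding_cycles(times, []))   # no checkpoints
    print(place_checkpoints(times, 1))
    print(place_checkpoints(times, 2))
```

With the made-up injection cycles above, a single checkpoint already removes most of the forwarding work because it skips the long gap before cycle 100. This is the kind of placement sensitivity the abstract describes, although the numbers here are illustrative only.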
