Achieving algorithmic resilience for temporal integration through spectral deferred corrections
R.W. Grout, H. Kolla, M.L. Minion, J.B. Bell

TL;DR
This paper proposes using spectral deferred corrections (SDC) as an iterative method to enhance the resilience of numerical algorithms against soft hardware faults, demonstrated through tests on canonical problems and complex combustion simulations.
Contribution
It introduces a novel strategy leveraging SDC's iterative nature to recover from soft errors, improving algorithmic resilience in scientific computing.
Findings
SDC can recover from soft (transient) hardware faults by running extra correction iterations.
The proposed method maintains accuracy despite transient errors.
The approach was applied successfully to a complex combustion simulation code.
Abstract
Spectral deferred corrections (SDC) is an iterative approach for constructing higher-order accurate numerical approximations of ordinary differential equations. SDC starts with an initial approximation of the solution defined at a set of Gaussian or spectral collocation nodes over a time interval and uses an iterative application of lower-order time discretizations applied to a correction equation to improve the solution at these nodes. Each deferred correction sweep increases the formal order of accuracy of the method up to the limit inherent in the accuracy defined by the collocation points. In this paper, we demonstrate that SDC is well suited to recovering from soft (transient) hardware faults in the data. A strategy where extra correction iterations are used to recover from soft errors and provide algorithmic resilience is proposed. Specifically, in this approach the iteration is…
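The sweep structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made here for illustration: explicit Euler as the lower-order method, three Gauss-Lobatto nodes on the unit interval, a scalar test equation y' = -y, a collocation-residual stopping test, and a fault modelled as a large additive perturbation to one node value. The names `lagrange_integrals`, `collocation_residual`, `sdc_step`, `fault_sweep`, and `fault_size` are hypothetical.

```python
import math

def lagrange_integrals(nodes):
    """S[m][j] = integral over [nodes[m], nodes[m+1]] of the j-th Lagrange basis polynomial."""
    M = len(nodes)
    S = [[0.0] * M for _ in range(M - 1)]
    for j in range(M):
        coeffs = [1.0]  # polynomial coefficients, ascending powers
        for i in range(M):
            if i != j:
                scale = nodes[j] - nodes[i]
                new = [0.0] * (len(coeffs) + 1)
                for p, c in enumerate(coeffs):  # multiply by (x - nodes[i]) / scale
                    new[p + 1] += c / scale
                    new[p] -= c * nodes[i] / scale
                coeffs = new

        def antiderivative(x, coeffs=coeffs):
            return sum(c * x ** (p + 1) / (p + 1) for p, c in enumerate(coeffs))

        for m in range(M - 1):
            S[m][j] = antiderivative(nodes[m + 1]) - antiderivative(nodes[m])
    return S

def collocation_residual(f, t, Y, y0, dt, S):
    """Largest violation of y_{m+1} = y0 + integral of the interpolant of f over [t0, t_{m+1}]."""
    F = [f(tm, ym) for tm, ym in zip(t, Y)]
    running, worst = y0, 0.0
    for m in range(len(Y) - 1):
        running += dt * sum(S[m][j] * F[j] for j in range(len(Y)))
        worst = max(worst, abs(running - Y[m + 1]))
    return worst

def sdc_step(f, t0, y0, dt, nodes, tol=1e-12, max_sweeps=50,
             fault_sweep=None, fault_size=10.0):
    """One SDC time step with explicit Euler sweeps; returns (y at end of step, sweeps used)."""
    M = len(nodes)
    S = lagrange_integrals(nodes)
    t = [t0 + dt * x for x in nodes]
    Y = [y0] * M  # initial approximation: constant in time
    for k in range(1, max_sweeps + 1):
        F_old = [f(t[m], Y[m]) for m in range(M)]
        Y_new = [y0] * M
        for m in range(M - 1):
            quad = dt * sum(S[m][j] * F_old[j] for j in range(M))
            h = t[m + 1] - t[m]
            # Euler correction plus spectral quadrature of the previous iterate
            Y_new[m + 1] = Y_new[m] + h * (f(t[m], Y_new[m]) - F_old[m]) + quad
        Y = Y_new
        if k == fault_sweep:
            Y[1] += fault_size  # hypothetical soft fault: corrupt one stored node value
        # Assumed stopping rule: keep sweeping until the residual is small
        if collocation_residual(f, t, Y, y0, dt, S) < tol:
            return Y[-1], k
    raise RuntimeError("SDC sweeps did not converge")

decay = lambda t, y: -y          # test problem y' = -y, y(0) = 1
nodes = [0.0, 0.5, 1.0]          # three Gauss-Lobatto nodes on [0, 1]
y_clean, sweeps_clean = sdc_step(decay, 0.0, 1.0, 0.5, nodes)
y_fault, sweeps_fault = sdc_step(decay, 0.0, 1.0, 0.5, nodes, fault_sweep=2)
print("fault-free:", y_clean, "after", sweeps_clean, "sweeps")
print("with fault:", y_fault, "after", sweeps_fault, "sweeps")
print("exact     :", math.exp(-0.5))
```

In this sketch the corrupted run converges to the same collocation solution as the fault-free run, at the cost of additional sweeps, because every sweep pulls the iterate back toward the same fixed point. Whether the paper uses this particular stopping rule or fault model cannot be determined from the truncated abstract.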
