Combining Fault Tolerance Techniques and COTS SoC Accelerators for Payload Processing in Space
Vasileios Leon, Elissaios Alexios Papatheofanous, George Lentaris, Charalampos Bezaitis, Nikolaos Mastorakis, Georgios Bampilis, Dionysios Reisis, Dimitrios Soudris

TL;DR
This paper investigates fault-tolerance techniques for COTS accelerators, specifically FPGAs and VPUs, used for payload processing in space, with the aim of making on-board computing more reliable as computational demands grow.
Contribution
It combines fault-tolerance techniques for the Zynq FPGA and the Myriad VPU, and develops a fault-tolerant interface for FPGA-VPU co-processing in space applications.
Findings
Effective fault-tolerance strategies for FPGA and VPU devices.
Successful implementation of a fault-tolerant interface between FPGA and VPU.
Enhanced reliability of space payload processing systems.
Abstract
The ever-increasing demand for computational power and I/O throughput in space applications is transforming the landscape of on-board computing. A variety of Commercial-Off-The-Shelf (COTS) accelerators is emerging as an attractive solution for payload processing that can outperform traditional radiation-hardened devices. To increase the reliability of such COTS accelerators, this paper explores and evaluates fault-tolerance techniques for the Zynq FPGA and the Myriad VPU, two device families that are being integrated into industrial space avionics architectures and boards, such as Ubotica's CogniSat, Xiphos' Q7S, and Cobham Gaisler's GR-VPX-XCKU060. On the FPGA side, we combine techniques such as memory scrubbing, partial reconfiguration, triple modular redundancy, and watchdogs. On the VPU side, we detect and correct errors in the instruction and data memories, as well as we apply…
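As a rough illustration of two techniques the abstract names, triple modular redundancy and memory scrubbing, the sketch below votes bitwise across three redundant copies of a word and then rewrites all copies with the voted value. It is a generic software model, not the paper's FPGA or VPU implementation; the function names `tmr_vote` and `scrub` are hypothetical and chosen only for this example.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Return the bitwise majority of three redundant copies of a word.

    Each output bit takes the value held by at least two of the inputs,
    so an upset confined to one copy is masked.
    """
    return (a & b) | (a & c) | (b & c)


def scrub(copies: list[int]) -> tuple[int, bool]:
    """Vote across three copies, then overwrite them with the voted value.

    Returns the voted word and a flag that is True when at least one copy
    disagreed with the majority (i.e. a correction took place).
    Hypothetical helper for illustration only.
    """
    if len(copies) != 3:
        raise ValueError("expected exactly three redundant copies")
    voted = tmr_vote(copies[0], copies[1], copies[2])
    corrected = any(copy != voted for copy in copies)
    copies[:] = [voted, voted, voted]  # repair in place
    return voted, corrected
```

Calling `scrub` on three copies where one has a flipped bit returns the original word and reports a correction. Note that the scheme masks faults in a single copy only: if the same bit is upset in two copies before the next scrub, the vote returns the wrong value, which is why scrubbing is normally run periodically.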
