Prototype of Fault Adaptive Embedded Software for Large-Scale Real-Time Systems
Derek Messie (1), Mina Jung (1), Jae C. Oh (1), Shweta Shetty (2), Steven Nordstrom (2), Michael Haney (3) ((1) Syracuse University, (2) Vanderbilt University, (3) University of Illinois at Urbana-Champaign)

TL;DR
This paper presents a prototype of fault adaptive embedded software for large-scale real-time systems, featuring self-optimizing agents and self-healing objects to enhance reliability and fault mitigation in complex environments.
Contribution
It introduces a distributed, self-adaptive fault mitigation approach with self-optimizing agents and self-healing objects for large-scale real-time embedded systems.
Findings
Successfully demonstrated proactive and reactive fault mitigation.
Enabled real-time detection and reaction to failures.
Supported large-scale deployment with 2500 DSPs.
Abstract
This paper describes a comprehensive prototype of large-scale fault adaptive embedded software developed for the proposed Fermilab BTeV high energy physics experiment. Lightweight self-optimizing agents embedded within Level 1 of the prototype are responsible for proactive and reactive monitoring and mitigation based on specified layers of competence. The agents are self-protecting, detecting cascading failures using a distributed approach. Adaptive, reconfigurable, and mobile objects for reliability are designed to be self-configuring to adapt automatically to dynamically changing environments. These objects provide a self-healing layer with the ability to discover, diagnose, and react to discontinuities in real-time processing. A generic modeling environment was developed to facilitate design and implementation of hardware resource specifications, application data flow, and failure…
Taxonomy
Topics: Embedded Systems and FPGA Design · Real-time simulation and control systems · Real-Time Systems Scheduling
