Bio-inspired Agentic Self-healing Framework for Resilient Distributed Computing Continuum Systems
Alaa Saleh, Praveen Kumar Donta, Roberto Morabito, Sasu Tarkoma, Anders Lindgren, Qiyang Zhang, Schahram Dustdar, Susanna Pirttikangas, and Lauri Lovén

TL;DR
ReCiSt is a bio-inspired, agent-based self-healing framework for resilient Distributed Computing Continuum Systems (DCCS). It models autonomous fault detection, diagnosis, and recovery on the phases of biological healing, and reports recovery within tens of seconds at low agent CPU cost.
Contribution
This paper introduces ReCiSt, a novel bio-inspired agentic framework that adapts biological healing phases into computational layers for autonomous resilience in complex distributed systems.
Findings
Self-healing achieved within tens of seconds
Agent CPU usage as low as 10%
Effective fault diagnosis and resource reconfiguration
Abstract
Human biological systems sustain life through extraordinary resilience, continually detecting damage, orchestrating targeted responses, and restoring function through self-healing. Inspired by these capabilities, this paper introduces ReCiSt, a bio-inspired agentic self-healing framework designed to achieve resilience in Distributed Computing Continuum Systems (DCCS). Modern DCCS integrate heterogeneous computing resources, ranging from resource-constrained IoT devices to high-performance cloud infrastructures, and their inherent complexity, mobility, and dynamic operating conditions expose them to frequent faults that disrupt service continuity. These challenges underscore the need for scalable, adaptive, and self-regulated resilience strategies. ReCiSt reconstructs the biological phases of Hemostasis, Inflammation, Proliferation, and Remodeling into the computational layers…
Taxonomy
Topics: Software System Performance and Reliability · Modular Robots and Swarm Intelligence · Advanced Software Engineering Methodologies
