A Behavioral Theory for Distributed Systems with Weak Recovery
Giovanni Fabbretti, Ivan Lanese, Jean-Bernard Stefani

TL;DR
This paper introduces a formal calculus for modeling distributed systems with crash failures and weak recovery, providing a behavioral theory to reason about system equivalences and fault-tolerance.
Contribution
It presents a novel calculus inspired by Erlang and distributed π-calculus, along with a behavioral theory and weak bisimilarity for reasoning about failures and recovery.
Findings
Developed a formal calculus for crash failures with weak recovery
Established a behavioral equivalence using labelled transition systems
Provided a compositional proof technique for system equivalence
Abstract
Distributed systems can be subject to various kinds of partial failures, therefore building fault-tolerance or failure mitigation mechanisms for distributed systems remains an important domain of research. In this paper, we present a calculus to formally model distributed systems subject to crash failures with recovery. The recovery model considered in the paper is weak, in the sense that it makes no assumption on the exact state in which a failed node resumes its execution, only its identity has to be distinguishable from past incarnations of itself. Our calculus is inspired in part by the Erlang programming language and in part by the distributed -calculus with nodes and link failures (DF) introduced by Francalanza and Hennessy. In order to reason about distributed systems with failures and recovery we develop a behavioral theory for our calculus, in the form of a contextual…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDistributed systems and fault tolerance
