Dynamic FTSS in Asynchronous Systems: the Case of Unison
Swan Dubois (INRIA Rocquencourt, LIP6), Maria Potop-Butucaru (INRIA, Rocquencourt, LIP6), Sébastien Tixeuil (LIP6)

TL;DR
This paper investigates fault-tolerant self-stabilizing protocols for dynamic clock synchronization in asynchronous distributed systems, focusing on the unison problem, and presents impossibility results along with an optimal solution when feasible.
Contribution
It is the first to study FTSS protocols for dynamic tasks in asynchronous systems, specifically addressing the unison problem with new impossibility results and an optimal fault containment solution.
Findings
Many impossibility results for asynchronous unison
An FTSS solution with optimal fault containment when solvable
Extension of FTSS protocols to dynamic, asynchronous settings
Abstract
Distributed fault-tolerance can mask the effect of a limited number of permanent faults, while self-stabilization provides forward recovery after an arbitrary number of transient faults hit the system. FTSS protocols combine the best of both worlds since they are simultaneously fault-tolerant and self-stabilizing. To date, FTSS solutions either consider static (i.e. fixed point) tasks, or assume synchronous scheduling of the system components. In this paper, we present the first study of dynamic tasks in asynchronous systems, considering the unison problem as a benchmark. Unison can be seen as a local clock synchronization problem, as neighbors must maintain digital clocks at most one time unit away from each other, and increment their own clock value infinitely often. We present many impossibility results for this difficult problem and propose an FTSS solution when the problem is solvable…
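The unison safety condition described in the abstract can be illustrated with a minimal Python sketch. It is not the protocol from the paper: the function names `is_unison_safe` and `can_increment`, the use of unbounded integer clocks, and the local rule "tick only if no neighbor is behind" are all assumptions made for illustration, and the sketch ignores faults entirely.

```python
from typing import Dict, Hashable, List, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def is_unison_safe(clocks: Dict[Node, int], edges: List[Edge]) -> bool:
    """Safety: every pair of neighbors has clocks at most one unit apart."""
    return all(abs(clocks[u] - clocks[v]) <= 1 for u, v in edges)


def can_increment(node: Node, clocks: Dict[Node, int], edges: List[Edge]) -> bool:
    """Hypothetical local rule (an assumption, not the paper's protocol):
    a node may tick only if none of its neighbors is behind it."""
    neighbors = [v if u == node else u for u, v in edges if node in (u, v)]
    return all(clocks[node] <= clocks[n] for n in neighbors)


# A three-node path a - b - c with unbounded integer clocks.
edges = [("a", "b"), ("b", "c")]
clocks = {"a": 3, "b": 4, "c": 4}

safe_before = is_unison_safe(clocks, edges)
a_may_tick = can_increment("a", clocks, edges)
b_may_tick = can_increment("b", clocks, edges)

# Let "a" tick; under the rule above, safety is preserved.
clocks["a"] += 1
safe_after = is_unison_safe(clocks, edges)
```

In this sketch, a node that ticks under the assumed rule ends up at most one unit ahead of any neighbor, so a safe configuration stays safe. The liveness half of unison (each node increments infinitely often) depends on the scheduler and on faulty nodes, which is where the difficulty studied in the paper lies.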
Taxonomy
Topics: Distributed systems and fault tolerance · Real-Time Systems Scheduling · Parallel Computing and Optimization Techniques
