A Knowledge-Theoretic Analysis of Uniform Distributed Coordination and Failure Detectors
Joseph Y. Halpern, Aleta Ricciardi

TL;DR
This paper shows that achieving Uniform Distributed Coordination (UDC) in a system with unreliable but fair communication is equivalent, in a precise sense, to having certain failure detectors. Processes' knowledge about which other processes are faulty is central to the analysis.
Contribution
It provides a knowledge-theoretic characterization of the conditions under which UDC can be achieved with or without bounds on faulty processes, linking failure detectors to process knowledge.
Findings
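With no bound on the number of faulty processes and unreliable but fair communication, UDC is attainable if and only if the system has perfect failure detectors.
With at most t faulty processes, a certain type of generalized failure detector is necessary and sufficient for UDC.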
Knowledge about which processes are faulty is crucial for achieving UDC.
Abstract
It is shown that, in a precise sense, if there is no bound on the number of faulty processes in a system with unreliable but fair communication, Uniform Distributed Coordination (UDC) can be attained if and only if a system has perfect failure detectors. This result is generalized to the case where there is a bound t on the number of faulty processes. It is shown that a certain type of generalized failure detector is necessary and sufficient for achieving UDC in a context with at most t faulty processes. Reasoning about processes' knowledge as to which other processes are faulty plays a key role in the analysis.
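To make "perfect failure detector" concrete, here is a minimal Python sketch that checks a finite trace against the two properties usually required of one in the failure-detector literature: strong accuracy (no process is suspected before it crashes) and strong completeness (every crashed process is eventually suspected by every correct process). The paper's own formal definitions may differ in detail. The trace encoding, the names `Report`, `strong_accuracy`, and `completeness_at_end`, and the finite-trace approximation of "eventually" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical trace encoding: (time, observer, processes the observer suspects).
Report = Tuple[int, str, Set[str]]


def strong_accuracy(crash_time: Dict[str, Optional[int]],
                    reports: List[Report]) -> bool:
    """Return True if no process is suspected before it has crashed."""
    for time, _observer, suspects in reports:
        for q in suspects:
            t_crash = crash_time[q]
            # Suspecting a process that never crashes, or has not crashed yet,
            # violates strong accuracy.
            if t_crash is None or t_crash > time:
                return False
    return True


def completeness_at_end(crash_time: Dict[str, Optional[int]],
                        reports: List[Report]) -> bool:
    """Finite-trace stand-in for strong completeness (an assumption):
    the last report of every process that never crashes must suspect
    every process that crashed."""
    crashed = {p for p, t in crash_time.items() if t is not None}
    correct = set(crash_time) - crashed
    last: Dict[str, Set[str]] = {}
    for _time, observer, suspects in sorted(reports, key=lambda r: r[0]):
        last[observer] = suspects
    return all(crashed <= last.get(p, set()) for p in correct)


# Example trace: p2 crashes at time 3; p1 and p3 never crash.
crash_time = {"p1": None, "p2": 3, "p3": None}
good_trace = [(1, "p1", set()), (4, "p1", {"p2"}), (5, "p3", {"p2"})]
bad_trace = [(2, "p1", {"p2"})]  # p1 suspects p2 one step too early
```

In this encoding, `good_trace` passes both checks, while `bad_trace` fails strong accuracy because p2 is suspected at time 2, before its crash at time 3. A real detector is judged on infinite runs, so the completeness check here is only an approximation.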
Taxonomy
Topics: Distributed systems and fault tolerance · Petri Nets in System Modeling · Optimization and Search Problems
