Beyond Optimal Fault Tolerance
Andrew Lewis-Pye, Tim Roughgarden

TL;DR
This paper explores the limits of fault-tolerance in state machine replication protocols when allowing a bounded number of consistency violations, providing bounds and recovery methods under relaxed assumptions.
Contribution
It introduces a framework for achieving better-than-optimal fault-tolerance by relaxing consistency constraints and presents matching bounds and a generic recovery procedure.
Findings
Bounding rollback is impossible without additional timing assumptions.
Optimal recoverable fault-tolerance depends on the number of allowed violations.
Recovery procedures can restore consistency with limited rollback.
Abstract
The optimal fault-tolerance achievable by any protocol has been characterized in a wide range of settings. For example, for state machine replication (SMR) protocols operating in the partially synchronous setting, it is possible to simultaneously guarantee consistency against $\alpha$-bounded adversaries (i.e., adversaries that control less than an $\alpha$ fraction of the participants) and liveness against $\beta$-bounded adversaries if and only if $\alpha + 2\beta \leq 1$. This paper characterizes to what extent "better-than-optimal" fault-tolerance guarantees are possible for SMR protocols when the standard consistency requirement is relaxed to allow a bounded number of consistency violations. We prove that bounding rollback is impossible without additional timing assumptions and investigate protocols that tolerate and recover from consistency violations whenever message delays…
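The tradeoff in the abstract's partially synchronous example, taken here to be the condition α + 2β ≤ 1, can be illustrated with a small check. This is only a sketch of that inequality, not anything from the paper's protocols; the function name `tradeoff_is_achievable` is hypothetical, and exact fractions are used so that boundary cases such as α = β = 1/3 are not affected by floating-point rounding.

```python
from fractions import Fraction


def tradeoff_is_achievable(alpha, beta):
    """Return True if the pair (alpha, beta) satisfies alpha + 2*beta <= 1.

    alpha: consistency threshold (adversary controls less than an alpha fraction).
    beta:  liveness threshold (adversary controls less than a beta fraction).
    Assumption: this encodes only the classical partially synchronous bound,
    not the relaxed guarantees studied in the paper.
    """
    alpha, beta = Fraction(alpha), Fraction(beta)
    if not (0 <= alpha <= 1 and 0 <= beta <= 1):
        raise ValueError("alpha and beta must be fractions in [0, 1]")
    return alpha + 2 * beta <= 1


# The familiar "one third" setting sits exactly on the boundary.
print(tradeoff_is_achievable(Fraction(1, 3), Fraction(1, 3)))
# Raising the consistency threshold to 1/2 forces liveness down to 1/4.
print(tradeoff_is_achievable(Fraction(1, 2), Fraction(1, 4)))
print(tradeoff_is_achievable(Fraction(1, 2), Fraction(1, 3)))
```

Under this reading, any gain in the consistency threshold α costs twice as much in the liveness threshold β, which is the baseline the paper's "better-than-optimal" guarantees are measured against.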
Taxonomy
Topics: Fault Detection and Control Systems
