The fault-tolerant cluster-sending problem
Jelle Hellings, Mohammad Sadoghi

TL;DR
This paper formalizes the cluster-sending problem in Byzantine distributed systems, establishes its lower bounds, and develops optimal protocols, advancing fault-tolerant communication primitives beyond consensus.
Contribution
The paper introduces the cluster-sending problem, proves lower bounds on its complexity under crash and Byzantine failures, and provides practical protocols that meet those bounds.
Findings
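Lower bounds on the complexity of cluster-sending are established under both crash failures and Byzantine failures.
Practical cluster-sending protocols are developed that meet these lower bounds and therefore have optimal complexity.
Cluster-sending is identified as a fault-tolerant communication primitive beyond consensus.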
Abstract
The development of fault-tolerant distributed systems that can tolerate Byzantine behavior has traditionally focused on consensus protocols, which support fully-replicated designs. The development of more sophisticated high-performance Byzantine distributed systems, however, requires more specialized fault-tolerant communication primitives. In this paper, we identify an essential communication primitive and study it in depth. Specifically, we formalize the cluster-sending problem: the problem of sending a message from one Byzantine cluster to another Byzantine cluster in a reliable manner. We not only formalize this fundamental problem, but also establish lower bounds on its complexity under crash failures and Byzantine failures. Furthermore, we develop practical cluster-sending protocols that meet these lower bounds and, hence, have optimal complexity.…
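
To make the problem statement concrete, the following minimal sketch illustrates a pigeonhole argument in a toy crash-failure model. It is not taken from the paper and does not reproduce its protocols or its exact bounds. The function name some_pair_survives and its parameters (n1, n2 for cluster sizes, f1, f2 for the number of crashed replicas in each cluster, num_pairs for the number of messages) are hypothetical. The assumption is that each message travels between a distinct sender and a distinct receiver, and that a message is delivered only if neither endpoint has crashed.

```python
from itertools import combinations


def some_pair_survives(n1, n2, f1, f2, num_pairs):
    """Toy crash-failure model (an assumption, not the paper's protocol).

    Sender i sends to receiver i for i in range(num_pairs). Returns True
    if, for every choice of f1 crashed senders and f2 crashed receivers,
    at least one pair has two non-crashed endpoints.
    """
    if num_pairs > min(n1, n2):
        raise ValueError("num_pairs cannot exceed the smaller cluster size")
    pairs = [(i, i) for i in range(num_pairs)]  # disjoint sender-receiver pairs
    for crashed_senders in combinations(range(n1), f1):
        for crashed_receivers in combinations(range(n2), f2):
            cs, cr = set(crashed_senders), set(crashed_receivers)
            # The message arrives if some pair avoids every crashed replica.
            if not any(s not in cs and r not in cr for s, r in pairs):
                return False
    return True


if __name__ == "__main__":
    # Each crash disables at most one disjoint pair, so f1 + f2 crashes
    # cannot disable f1 + f2 + 1 pairs, but they can disable f1 + f2 pairs.
    print(some_pair_survives(4, 4, 1, 1, num_pairs=3))
    print(some_pair_survives(4, 4, 1, 1, num_pairs=2))
```

In this toy model, f1 + f2 + 1 disjoint pairs always leave one working pair, while f1 + f2 pairs can all be disabled. Byzantine failures need stronger guarantees than this sketch captures, since a faulty replica may also send incorrect messages.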
Taxonomy
Topics: Distributed systems and fault tolerance · Modular Robots and Swarm Intelligence · DNA and Biological Computing
