Leveraging TLA+ Specifications to Improve the Reliability of the ZooKeeper Coordination Service
Lingzhi Ouyang, Yu Huang, Binyu Huang, and Xiaoxing Ma

TL;DR
This paper demonstrates how formal TLA+ specifications can enhance the reliability of ZooKeeper by clarifying protocols, guiding testing, and identifying deep bugs beyond traditional testing methods.
Contribution
It introduces a multi-level specification approach for ZooKeeper using TLA+, integrating formal specs into the official project to improve reliability and documentation.
Findings
Formal specifications clarified protocol ambiguities
Deep bugs were identified that traditional testing missed
Specifications were integrated into the official ZooKeeper project
Abstract
ZooKeeper is a coordination service, widely used as a backbone of various distributed systems. Though its reliability is of critical importance, testing is insufficient for an industrial-strength system of the size and complexity of ZooKeeper, and deep bugs can still be found. To this end, we resort to formal TLA+ specifications to further improve the reliability of ZooKeeper. Our primary objective is usability and automation, rather than full verification. We incrementally develop three levels of specifications for ZooKeeper. We first obtain the protocol specification, which unambiguously specifies the Zab protocol behind ZooKeeper. We then proceed to a finer grain and obtain the system specification, which serves as the super-doc for system development. In order to further leverage the model-level specification to improve the reliability of the code-level implementation, we develop…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSoftware System Performance and Reliability · Software Testing and Debugging Techniques · Service-Oriented Architecture and Web Services
