Cornus: Atomic Commit for a Cloud DBMS with Storage Disaggregation (Extended Version)
Zhihan Guo, Xinyu Zeng, Kan Wu, Wuh-Chwen Hwang, Ziwei Ren, Xiangyao Yu, Mahesh Balakrishnan, Philip A. Bernstein

TL;DR
Cornus is a new atomic commit protocol designed for cloud database systems with storage disaggregation, significantly reducing latency and blocking issues inherent in traditional two-phase commit (2PC) methods.
Contribution
The paper introduces Cornus, an optimized 2PC protocol that uses the compare-and-swap capability of disaggregated storage to address 2PC's latency and blocking limitations. It supports the protocol with proofs and a deployment on real storage services.
Findings
Achieves up to 1.9x latency speedup over traditional 2PC
Addresses the blocking that participants experience when a coordinator fails, using disaggregated storage
Correctness is established through proofs
Abstract
Two-phase commit (2PC) is widely used in distributed databases to ensure the atomicity of distributed transactions. However, 2PC has two limitations. First, it requires two eager log writes on the critical path, which incurs significant latency. Second, when a coordinator fails, a participant may be blocked waiting for the coordinator's decision, leading to indefinitely long latency and low throughput. 2PC was originally designed for a shared-nothing architecture. We observe that the two problems above can be addressed in an emerging storage disaggregation architecture which provides compare-and-swap capability in the storage layer. We propose Cornus, an optimized 2PC protocol for Cloud DBMS with Storage Disaggregation. We present Cornus in detail with proofs and show how it addresses the two limitations in 2PC. We also deploy it on real storage services including Azure Blob Storage and…
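To make the role of compare-and-swap concrete, here is a minimal sketch of how a write-once log in shared storage could let any node resolve a transaction without waiting for the coordinator. It is an illustration under stated assumptions, not the paper's implementation: the names DisaggregatedLog, log_once, vote, and resolve are hypothetical, and an in-memory dictionary stands in for a real storage service.

```python
import threading


class DisaggregatedLog:
    """Toy in-memory stand-in for a storage service with compare-and-swap (assumed)."""

    def __init__(self):
        self._records = {}
        self._lock = threading.Lock()

    def log_once(self, txn_id, node_id, value):
        """Write `value` only if no record exists yet; return the record that won."""
        with self._lock:
            # setdefault stores `value` only when the key is absent.
            return self._records.setdefault((txn_id, node_id), value)


def vote(log, txn_id, node_id, can_commit):
    """A participant records its vote. If a peer already wrote ABORT
    on its behalf, that earlier record is returned instead."""
    return log.log_once(txn_id, node_id, "VOTE-YES" if can_commit else "ABORT")


def resolve(log, txn_id, participants):
    """Any node can derive the outcome from the participants' logs.
    It tries to install ABORT in each log; where a vote already exists,
    the compare-and-swap fails and the existing vote is read back."""
    states = [log.log_once(txn_id, p, "ABORT") for p in participants]
    return "COMMIT" if all(s == "VOTE-YES" for s in states) else "ABORT"


log = DisaggregatedLog()

# Both participants vote yes, so the collective state of the logs is a commit.
vote(log, "t1", "A", True)
vote(log, "t1", "B", True)
outcome_t1 = resolve(log, "t1", ["A", "B"])

# B is unresponsive: A times out, resolves, and B's late vote loses the race.
vote(log, "t2", "A", True)
outcome_t2 = resolve(log, "t2", ["A", "B"])
late_vote_b = vote(log, "t2", "B", True)
```

In this sketch the outcome is a function of the per-participant records alone, so no separate decision record from the coordinator is needed, and a participant that times out can force a resolution instead of blocking. A real storage service would need to supply the conditional write that log_once assumes.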
Taxonomy
Topics: Distributed systems and fault tolerance · Cloud Computing and Resource Management · Advanced Data Storage Technologies
