A Tree-Structured Two-Phase Commit Framework for OceanBase: Optimizing Scalability and Consistency
Quanqing Xu, Chen Qian, Chuanhui Yang, Fanyu Kong, Guixiang Liu, Fusheng Han, and Zixiang Zhai

TL;DR
This paper presents a tree-structured two-phase commit framework for OceanBase that significantly reduces coordination overhead, handles dynamic partition transfers efficiently, and maintains strong consistency in distributed databases.
Contribution
It introduces a novel log stream-based participant model, a tree-shaped 2PC protocol with dynamic topology handling, and new states to prevent consistency violations during retries.
Findings
99% reduction in coordination overhead for 100 partitions
Performance approaches single-machine transaction latency
Effective handling of partition migrations during transactions
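The 99% figure can be reproduced with simple arithmetic, assuming coordination overhead scales with the number of 2PC participants and that all 100 partitions sit on one log stream. The sketch below is illustrative only: the placement map, the names `p0`…`p99` and `ls_1`, and both helper functions are hypothetical and are not taken from the paper or from OceanBase.

```python
from collections import defaultdict


def participants_by_log_stream(partition_to_stream, touched):
    """Group the partitions a transaction touches by their hosting log stream."""
    groups = defaultdict(list)
    for partition in touched:
        groups[partition_to_stream[partition]].append(partition)
    return dict(groups)


def participant_reduction(partition_to_stream, touched):
    """Fraction of participants saved by committing per log stream, not per partition."""
    before = len(set(touched))  # one participant per partition
    after = len(participants_by_log_stream(partition_to_stream, touched))
    return 1 - after / before


# Hypothetical placement: 100 partitions co-located on a single log stream.
placement = {f"p{i}": "ls_1" for i in range(100)}
touched = list(placement)

groups = participants_by_log_stream(placement, touched)
reduction = participant_reduction(placement, touched)
print(len(groups), f"{reduction:.0%}")
```

With partitions spread over several log streams the saving is smaller: 100 partitions across 4 log streams would give 4 participants, a 96% reduction under the same assumption.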
Abstract
Modern distributed databases face challenges in achieving transactional consistency across distributed partitions. Traditional two-phase commit (2PC) protocols incur high coordination overhead and latency, and require complex recovery for dynamic partition transfers. This paper introduces a novel tree-shaped 2PC framework for OceanBase that leverages single-machine log streams to address these challenges through three innovations. First, we propose log streams as atomic participants, replacing partition-level coordination. By treating each log stream as the commit unit, a transaction spanning co-located partitions interacts with one participant, reducing coordination overhead by orders of magnitude (e.g., a 99 percent reduction for 100 partitions). Second, we design a tree-shaped 2PC protocol with a coordinator-rooted DAG topology that dynamically handles partition transfers by recursively…
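To make the tree-shaped idea concrete, here is a minimal sketch of generic hierarchical 2PC, in which the prepare request flows down from the coordinator and votes are aggregated back up. It is not the paper's protocol: the `Node` class, the state names, and the log stream names are assumptions made for illustration, and the sketch omits logging, timeouts, retries, and the additional states the paper introduces.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One participant (here: a hypothetical log stream) in the commit tree."""
    name: str
    vote_yes: bool = True
    children: list = field(default_factory=list)
    state: str = "INIT"


def prepare(node):
    """Phase 1: ask the whole subtree to prepare; True only if every node votes yes."""
    # Visit every child (no short-circuit) so each one receives the request.
    child_votes = [prepare(child) for child in node.children]
    node.state = "PREPARED" if node.vote_yes else "ABORTED"
    return node.vote_yes and all(child_votes)


def finish(node, commit):
    """Phase 2: push the coordinator's decision down the tree."""
    node.state = "COMMITTED" if commit else "ABORTED"
    for child in node.children:
        finish(child, commit)


def run_2pc(root):
    """Run both phases from the coordinator (the tree root) and return the decision."""
    decision = prepare(root)
    finish(root, decision)
    return decision


def states(node):
    """Collect the final state of every node in the tree."""
    result = {node.name: node.state}
    for child in node.children:
        result.update(states(child))
    return result


# Hypothetical topology: ls_2 gained a child ls_3, e.g. after a partition moved there.
tree = Node("ls_1", children=[Node("ls_2", children=[Node("ls_3")])])
committed = run_2pc(tree)
print(committed, states(tree))
```

Because each node only talks to its direct children, a participant that appears mid-transaction can be attached under the node that knows about it without the root tracking it directly, which is one plausible reading of why a tree helps with partition transfers.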
Taxonomy
Topics: Distributed systems and fault tolerance · Cloud Computing and Resource Management · Software System Performance and Reliability
