Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions
Arash Tavakkol, Aasheesh Kolli, Stanko Novakovic, Kaveh Razavi, Juan Gomez-Luna, Hasan Hassan, Claude Barthels, Yaohua Wang, Mohammad Sadrosadati, Saugata Ghose, Ankit Singla, Pratap Subrahmanyam, and Onur Mutlu

TL;DR
This paper investigates the performance of RDMA-based synchronous mirroring for persistent memory, identifies limitations of existing primitives, and proposes new techniques to enhance efficiency and correctness in high-performance storage systems.
Contribution
The paper introduces new RDMA primitives and techniques designed for efficient and correct synchronous mirroring of persistent memory.
Findings
Existing RDMA primitives do not fully exploit the asynchronous capabilities of the hardware.
The proposed primitives enable more efficient and correct synchronous mirroring (SM) over RDMA.
New techniques significantly improve performance of persistent memory mirroring.
Abstract
Synchronous Mirroring (SM) is a standard approach to building highly-available and fault-tolerant enterprise storage systems. SM ensures strong data consistency by maintaining multiple exact data replicas and synchronously propagating every update to all of them. Such strong consistency provides fault tolerance guarantees and a simple programming model coveted by enterprise system designers. For current storage devices, SM comes at modest performance overheads. This is because performing both local and remote updates simultaneously is only marginally slower than performing just local updates, due to the relatively slow performance of accesses to storage in today's systems. However, emerging persistent memory and ultra-low-latency network technologies necessitate a careful re-evaluation of the existing SM techniques, as these technologies present fundamentally different latency…
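To make the latency argument in the abstract concrete, here is a minimal sketch of a back-of-the-envelope model. It assumes the local and remote updates are issued in parallel, so one mirrored update completes when the slower of the two paths finishes. The function names and all latency figures are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def sm_latency_us(local_write_us, remote_write_us, network_rtt_us):
    """Latency of one mirrored update, assuming the local write and the
    remote write (network round trip plus remote media write) run in parallel."""
    return max(local_write_us, network_rtt_us + remote_write_us)


def sm_overhead(local_write_us, remote_write_us, network_rtt_us):
    """Slowdown of a mirrored update relative to a local-only update."""
    mirrored = sm_latency_us(local_write_us, remote_write_us, network_rtt_us)
    return mirrored / local_write_us


# Hypothetical numbers (assumptions, not results from the paper).
# Slow storage device: the media write dominates the network round trip.
slow_storage = sm_overhead(local_write_us=100.0, remote_write_us=100.0, network_rtt_us=10.0)

# Persistent memory: the media write is far cheaper than the network round trip.
persistent_memory = sm_overhead(local_write_us=0.5, remote_write_us=0.5, network_rtt_us=2.0)

print(f"slow storage overhead:      {slow_storage:.1f}x")
print(f"persistent memory overhead: {persistent_memory:.1f}x")
```

Under these assumed figures, mirroring adds about 10% to a slow-storage update but makes a persistent-memory update several times slower, which is the shift in balance that motivates re-evaluating SM techniques.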
Taxonomy
Topics: Advanced Data Storage Technologies · Cloud Computing and Resource Management · Distributed systems and fault tolerance
