RDMA vs. RPC for Implementing Distributed Data Structures
Benjamin Brock, Yuxin Chen, Jiakun Yan, John D. Owens, Aydın Buluç, and Katherine Yelick

TL;DR
This paper compares RDMA and RPC approaches for distributed data structures, analyzing their performance trade-offs through microbenchmarks, a performance model, and real-world experiments to guide implementation choices.
Contribution
It provides a detailed performance analysis and modeling of RDMA versus RPC for distributed data structure operations, aiding developers and network architects.
Findings
RDMA offers lower latency and overhead due to hardware support.
RPC is more expressive but costs more per operation and can stall when the remote side is not attentive to incoming requests.
The performance model helps in selecting optimal implementation strategies.
Abstract
Distributed data structures are key to implementing scalable applications for scientific simulations and data analysis. In this paper we look at two implementation styles for distributed data structures: remote direct memory access (RDMA) and remote procedure call (RPC). We focus on operations that require individual accesses to remote portions of a distributed data structure, e.g., accessing a hash table bucket or distributed queue, rather than global operations in which all processors collectively exchange information. We look at the trade-offs between the two styles through microbenchmarks and a performance model that approximates the cost of each. The RDMA operations have direct hardware support in the network and therefore lower latency and overhead, while the RPC operations are more expressive but higher cost and can suffer from lack of attentiveness from the remote side. We also…
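As a rough illustration of the trade-off described in the abstract, the sketch below simulates one remote hash table partition in a single Python process and counts network round trips for an insert done in each style. Everything in it is an assumption made for illustration and not the paper's implementation: the `RemotePartition` class, the flag-per-slot layout, the open-addressing scheme, and the three-step RDMA insert (compare-and-swap to reserve a slot, put the item, put a "ready" flag). Duplicate keys and real concurrency are ignored.

```python
EMPTY, RESERVED, READY = 0, 1, 2

class RemotePartition:
    """Hypothetical memory owned by one remote process (toy model)."""

    def __init__(self, size):
        self.size = size
        self.flags = [EMPTY] * size
        self.slots = [None] * size
        self.round_trips = 0  # round trips issued by the client

    # One-sided primitives (RDMA style): one round trip each,
    # and the owner's CPU does no work.
    def compare_and_swap(self, i, expected, new):
        self.round_trips += 1
        old = self.flags[i]
        if old == expected:
            self.flags[i] = new
        return old

    def put_slot(self, i, item):
        self.round_trips += 1
        self.slots[i] = item

    def put_flag(self, i, value):
        self.round_trips += 1
        self.flags[i] = value

    # RPC style: one round trip, but the handler runs on the owner's CPU,
    # so it only makes progress when the owner is attentive.
    def rpc_insert(self, key, value):
        self.round_trips += 1
        start = hash(key) % self.size
        for step in range(self.size):
            i = (start + step) % self.size
            if self.flags[i] == EMPTY:
                self.slots[i] = (key, value)
                self.flags[i] = READY
                return True
        return False


def rdma_insert(part, key, value):
    """Client-driven insert built only from one-sided operations."""
    start = hash(key) % part.size
    for step in range(part.size):
        i = (start + step) % part.size
        # Try to reserve the slot; a failed CAS still costs a round trip.
        if part.compare_and_swap(i, EMPTY, RESERVED) == EMPTY:
            part.put_slot(i, (key, value))
            part.put_flag(i, READY)
            return True
    return False


rdma_part = RemotePartition(8)
rpc_part = RemotePartition(8)
for k in (5, 13):  # 13 collides with 5 in a table of size 8
    rdma_insert(rdma_part, k, "v")
    rpc_part.rpc_insert(k, "v")

print("RDMA round trips:", rdma_part.round_trips)
print("RPC round trips:", rpc_part.round_trips)
```

In this toy model the RDMA-style insert needs several cheap round trips, and more when it collides, while the RPC-style insert needs a single round trip that depends on the remote CPU running the handler. Which one wins in practice depends on the per-operation costs and on attentiveness, which is the question the paper's microbenchmarks and performance model address.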
