Swift: Rethinking RDMA Control Plane for Elastic Computing
Junxue Zhang, Han Tian, Xinyang Huang, Wenxue Li, Kaiqiang Xu, Dian Shen, Yong Wang, Kai Chen

TL;DR
This paper introduces Swift, a novel RDMA control plane design for elastic computing that leverages caching and process forking to significantly improve throughput and latency in serverless environments.
Contribution
Swift rethinks RDMA control plane assumptions, proposing cache-based connection setup and fork-based resource sharing to enhance elastic computing performance.
Findings
Achieves 30.56-46.50% higher throughput
Reduces latency by 18.55-37.21%
Adds only 6.5% control plane overhead
Abstract
Elastic computing enables dynamic scaling to meet workload demands, and Remote Direct Memory Access (RDMA) enhances this by providing high-throughput, low-latency network communication. However, integrating RDMA into elastic computing remains a challenge, particularly in control plane operations for RDMA connection setup. This paper revisits the assumptions of prior work on high-performance RDMA for elastic computing, and reveals that extreme microsecond-level control plane optimizations are often unnecessary. By challenging the conventional beliefs on the slowness of user-space RDMA control plane and the difficulty of user-space RDMA resource sharing, we uncover new design opportunities. Our key insight is that user-space RDMA connection setup can be significantly improved with caching, while RDMA resources can be efficiently shared among processes using fork. In light of this, we…
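The two ideas named in the abstract, caching connection setup and sharing resources across processes via fork, can be illustrated with a toy sketch. This is a plain-Python analogy written under stated assumptions, not the paper's implementation: it does not touch any RDMA verbs or hardware, and the names `ConnectionCache`, `_expensive_setup`, and `query_in_child` are hypothetical. It only shows that a cached setup result is reused on later lookups, and that a forked child inherits the parent's cache without repeating the setup.

```python
import os


class ConnectionCache:
    """Toy cache mapping a peer name to an already-established 'connection'.

    Hypothetical stand-in: a real control plane would hold RDMA resources here.
    """

    def __init__(self):
        self._cache = {}
        self.setups = 0  # number of times the expensive path ran in this process

    def _expensive_setup(self, peer):
        # Placeholder for the costly control-plane work of creating a connection.
        self.setups += 1
        return {"peer": peer, "id": self.setups}

    def get(self, peer):
        # Return the cached connection if present; otherwise set it up once.
        conn = self._cache.get(peer)
        if conn is None:
            conn = self._expensive_setup(peer)
            self._cache[peer] = conn
        return conn


def query_in_child(cache, peer):
    """Fork a child that looks up `peer` in the inherited cache.

    Returns (connection id, setup count) as observed inside the child.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: it sees a copy of the parent's cache at fork time.
        try:
            os.close(read_fd)
            conn = cache.get(peer)
            os.write(write_fd, f"{conn['id']},{cache.setups}".encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # never fall through into the parent's code path
    # Parent process: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    conn_id, setups = data.split(",")
    return int(conn_id), int(setups)


if __name__ == "__main__":
    cache = ConnectionCache()
    cache.get("peer-a")                      # first lookup pays the setup cost
    cache.get("peer-a")                      # second lookup is served from the cache
    print(query_in_child(cache, "peer-a"))   # child reuses the inherited entry
```

In this sketch the child's setup count stays equal to the parent's because the lookup hits the inherited entry. Whether and how actual RDMA resources remain usable after a fork depends on the RDMA stack, which the toy example does not model.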
Taxonomy
Topics: Distributed and Parallel Computing Systems
