Handling of Memory Page Faults during Virtual-Address RDMA
Antonis Psistakis

TL;DR
This paper presents a hardware-software mechanism for handling memory page faults during RDMA operations, improving performance and memory utilization without relying on pinning or pre-faulting techniques.
Contribution
It introduces a novel page-fault handling mechanism integrated with the DMA engine, involving modifications to hardware and Linux drivers for efficient fault resolution.
Findings
Effective page-fault detection via ARM SMMU
Reduced programming complexity compared to pinning
Improved memory utilization and performance
Abstract
In modern high-speed interconnection networks, avoiding system calls during cluster communication (e.g., in data centers and high-performance computing) has become a necessity because of the high overhead of multiple data copies between kernel and user space. User-level zero-copy Remote Direct Memory Access (RDMA) technologies address this problem, improving performance and reducing system energy consumption. However, traditional RDMA engines cannot tolerate page faults and therefore use various techniques to avoid them. State-of-the-art RDMA approaches typically rely on pinning address spaces or multiple pages per application. Pinning has long-term disadvantages: increased programming complexity (pinning and unpinning buffers), limits on how much memory can be pinned, and inefficient memory utilization. In addition, pinning does not fully prevent page faults…
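To make the "limits on how much memory can be pinned" point concrete, here is a minimal sketch that is not taken from the paper. It assumes a Unix-like system where the per-process locked-memory limit (RLIMIT_MEMLOCK) is one such limit. The helper name `pinnable_bytes` and the 1 GiB buffer size are hypothetical, chosen only for illustration, and the sketch ignores memory the process has already locked.

```python
import resource


def pinnable_bytes(requested: int, soft_limit: int) -> int:
    """Return how many of `requested` bytes fit under a locked-memory limit.

    Hypothetical helper for illustration: it compares a buffer size with
    the limit only and does not account for memory already locked.
    """
    if soft_limit == resource.RLIM_INFINITY:
        # No limit configured: the whole buffer could be pinned.
        return requested
    return min(requested, soft_limit)


if __name__ == "__main__":
    # Query the soft and hard locked-memory limits of this process.
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    buffer_size = 1 << 30  # assumed 1 GiB communication buffer
    fits = pinnable_bytes(buffer_size, soft)
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(f"{fits} of {buffer_size} bytes fit under the soft limit")
```

When the limit is smaller than the buffer, an application that relies on pinning has to raise the limit or pin and unpin in pieces, which is the kind of bookkeeping the abstract describes as increased programming complexity.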
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance · Advanced Data Storage Technologies
