CXLMemUring: A Hardware Software Co-design Paradigm for Asynchronous and Flexible Parallel CXL Memory Pool Access
Yiwei Yang

TL;DR
This paper introduces CXLMemUring, a hardware-software co-design paradigm that enhances asynchronous, flexible parallel CXL memory pool access by offloading memory operations to CXL endpoints and near-device cores, improving memory capacity and efficiency.
Contribution
It proposes a novel hardware-software co-design approach for CXL memory access, including adaptive code generation and profiling-guided updates for long-running jobs.
Findings
Evaluation with modified BOOMv3 shows promising results.
Simulation of CXL endpoint access demonstrates feasibility.
Adaptive code offloading improves memory operation efficiency.
Abstract
CXL is an emerging technology for expanding memory for both host CPUs and device accelerators through a load/store interface. Extending memory coherency to the PCIe root complex makes co-design more flexible, because memory can be accessed coherently using near-device compute. As demand grows for capacity at tolerable latency and bandwidth, a new hardware-software co-design is needed that offloads the synthesized memory operations to the CXL endpoint, the CXL switch, or cores near the CXL root complex (such as Intel DSA) to fetch data, while the CPU or accelerators carry on with other computation in the background. When the CXL load completes, the data is placed in L1 if capacity allows, and the in-core ROB is notified through a mailbox and resumes the calculation in the previous hardware context. Since the distance (timing window) of the load instruction sequence is unknown,…
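The offload-then-notify flow described in the abstract can be pictured with a small software analogy. The sketch below is not the paper's implementation: the names `OffloadRing`, `REMOTE_POOL`, and `run_demo` are hypothetical, a Python thread stands in for the near-memory core, and a completion queue stands in for the mailbox. It only illustrates the ordering of events: submit a batch of loads, keep computing, then pick up the results when notified.

```python
import queue
import threading

# Hypothetical stand-in for a remote CXL memory pool: address -> value.
REMOTE_POOL = {addr: addr * 2 for addr in range(16)}


class OffloadRing:
    """Toy submission/completion ring; every name here is illustrative."""

    def __init__(self):
        self.submission = queue.Queue()   # host -> offload engine
        self.completion = queue.Queue()   # offload engine -> host ("mailbox")
        self._worker = threading.Thread(target=self._engine, daemon=True)
        self._worker.start()

    def _engine(self):
        # Plays the role of the near-memory core: drain requests,
        # fetch the data, and post a completion entry.
        while True:
            req = self.submission.get()
            if req is None:               # shutdown sentinel
                break
            tag, addrs = req
            data = [REMOTE_POOL[a] for a in addrs]
            self.completion.put((tag, data))

    def submit(self, tag, addrs):
        # Non-blocking from the host's point of view.
        self.submission.put((tag, list(addrs)))

    def wait(self, timeout=5.0):
        # Blocks until a completion arrives (or raises queue.Empty on timeout).
        return self.completion.get(timeout=timeout)

    def close(self):
        self.submission.put(None)
        self._worker.join()


def run_demo():
    ring = OffloadRing()
    ring.submit(tag=1, addrs=[3, 5, 7])   # offload the batched loads
    local = sum(range(10))                # host does unrelated work meanwhile
    tag, data = ring.wait()               # notification: the results are ready
    ring.close()
    return local, tag, sum(data)


if __name__ == "__main__":
    print(run_demo())
```

In this analogy the host never stalls on an individual load; it blocks only at `wait()`, which is where the real design would resume the saved hardware context.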
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
