Modeling the Potential of Message-Free Communication via CXL.mem
Stepan Vanecek, Matthew Turner, Manisha Gajbe, Matthew Wolf, and Martin Schulz

TL;DR
This paper introduces a performance evaluation toolchain and an extended model to predict the benefits of using CXL.mem for data exchange in HPC systems, focusing on MPI applications and enabling targeted optimizations.
Contribution
It presents a novel toolchain and performance model that analyze MPI data access patterns to identify potential speedups from CXL.mem integration.
Findings
The model accurately predicts potential performance gains from CXL.mem.
Validation on sample applications demonstrates the tool's effectiveness.
The approach enables targeted MPI optimizations for improved HPC performance.
Abstract
Heterogeneous memory technologies are increasingly important instruments in addressing the memory wall in HPC systems. While most are deployed in single-node setups, CXL.mem is a technology that implements memories that can be attached to multiple nodes simultaneously, enabling shared memory pooling. This opens new possibilities, particularly for efficient inter-node communication. In this paper, we present a novel performance evaluation toolchain combined with an extended performance model for message-based communication, which can be used to predict potential performance benefits from using CXL.mem for data exchange. Our approach analyzes data access patterns of MPI applications: it examines on-node accesses to and from MPI buffers, as well as cross-node MPI traffic, to gain a full understanding of the impact of memory performance. We combine this data in an extended performance model…
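The abstract is truncated here, so the paper's extended model is not reproduced on this page. As a rough illustration of the kind of comparison such a model makes, the sketch below uses the classic latency-bandwidth (Hockney) cost model as a stand-in, not the authors' model. The function names, the parameters, and the assumption that a CXL.mem exchange can be described by a single latency and bandwidth pair are all hypothetical.

```python
# Minimal sketch, NOT the paper's model: a Hockney-style latency-bandwidth
# estimate applied to two hypothetical data-exchange paths.

def transfer_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to move nbytes: startup latency plus bytes over bandwidth."""
    if nbytes < 0:
        raise ValueError("nbytes must be non-negative")
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + nbytes / bandwidth_bytes_per_s


def predicted_speedup(nbytes, msg_latency_s, msg_bw, cxl_latency_s, cxl_bw):
    """Ratio of message-based transfer time to the assumed CXL.mem transfer time.

    A value above 1.0 means the assumed CXL.mem path is estimated to be faster.
    """
    t_msg = transfer_time(nbytes, msg_latency_s, msg_bw)
    t_cxl = transfer_time(nbytes, cxl_latency_s, cxl_bw)
    return t_msg / t_cxl


if __name__ == "__main__":
    # Placeholder parameters for illustration only; these are not measurements.
    for size in (1_024, 1_048_576):
        s = predicted_speedup(size, 2e-6, 10e9, 1e-6, 20e9)
        print(f"{size} bytes: estimated speedup {s:.2f}")
```

A real evaluation would replace the placeholder latencies and bandwidths with measured values and would also account for the on-node buffer accesses that the abstract mentions, which this sketch ignores.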
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance · Advanced Data Storage Technologies
