DOLMA: A Data Object Level Memory Disaggregation Framework for HPC Applications
Haoyu Zheng, Shouwei Gao, Jie Ren, Wenqian Dong

TL;DR
DOLMA is a memory disaggregation framework for HPC that offloads selected data objects to remote memory, cutting local memory usage by up to 63% while keeping performance degradation below 16%.
Contribution
DOLMA introduces a data object level memory disaggregation framework that leverages predictable access patterns and prefetching to improve HPC memory utilization.
Findings
Limits performance degradation to less than 16%.
Reduces local memory usage by up to 63%.
Effective for eight HPC workloads and kernels.
Abstract
Memory disaggregation is a promising way to scale memory capacity and improve utilization in HPC systems. However, the performance overhead of accessing remote memory poses a significant challenge, particularly for compute-intensive HPC applications where execution times are highly sensitive to data locality. In this work, we present DOLMA, a Data Object Level Memory disAggregation framework designed for HPC applications. DOLMA intelligently identifies and offloads data objects to remote memory, while providing quantitative analysis to decide a suitable local memory size. Furthermore, DOLMA leverages the predictable memory access patterns typical in HPC applications and enables remote memory prefetch via a dual-buffer design. By carefully balancing local and remote memory usage and maintaining multi-thread concurrency, DOLMA provides a flexible and efficient solution for leveraging…
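The dual-buffer prefetch idea mentioned in the abstract can be illustrated with a generic double-buffering sketch. This is not DOLMA's implementation: the names `fetch_chunk` and `process_with_prefetch` are hypothetical, the "remote memory" is simulated with an ordinary Python list, and the sketch assumes a sequential, predictable access pattern over fixed-size chunks. While the compute step works on one buffer, a background thread fills the other.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_chunk(remote, index, chunk_size):
    """Hypothetical stand-in for a remote-memory read: copy one chunk locally."""
    start = index * chunk_size
    return list(remote[start:start + chunk_size])


def process_with_prefetch(remote, chunk_size, compute):
    """Apply compute() to each chunk while the next chunk is fetched in the background."""
    num_chunks = (len(remote) + chunk_size - 1) // chunk_size
    results = []
    if num_chunks == 0:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Fill the first buffer before any computation starts.
        pending = pool.submit(fetch_chunk, remote, 0, chunk_size)
        for i in range(num_chunks):
            active = pending.result()  # wait until the current buffer is ready
            if i + 1 < num_chunks:
                # Start filling the second buffer; this overlaps with compute() below.
                pending = pool.submit(fetch_chunk, remote, i + 1, chunk_size)
            results.append(compute(active))
    return results


remote_object = list(range(10))  # simulated remote data object
partial_sums = process_with_prefetch(remote_object, 4, sum)
print(partial_sums)  # [6, 22, 17]
```

Only two chunk-sized buffers are live at any time (the one being computed on and the one being filled), which is how a scheme of this kind keeps the local footprint small while hiding part of the fetch latency behind computation.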
Taxonomy
Topics: Cloud Computing and Resource Management · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
