Modeling Shared Cache Performance of OpenMP Programs using Reuse Distance
Atanu Barai, Gopinath Chennupati, Nandakishore Santhi, Abdel-Hameed A. Badawy, Stephan Eidenbenz

TL;DR
This paper introduces a scalable probabilistic model to predict shared cache performance of OpenMP programs by analyzing reuse distance profiles from sequential memory traces, enabling accurate performance estimation on multicore systems.
Contribution
It presents a novel static, probabilistic approach to model shared cache reuse distances for parallel applications, improving prediction efficiency and accuracy.
Findings
Accurate shared cache hit-rate predictions
Effective static analysis from sequential traces
Scalable performance modeling approach
Abstract
Performance modeling of parallel applications on multicore computers remains a challenge in computational co-design due to the complex design of multicore processors, including private and shared memory hierarchies. We present a Scalable Analytical Shared Memory Model to predict the performance of parallel applications that run on a multicore computer and share the same level of cache in the hierarchy. This model uses a computationally efficient, probabilistic method to predict the reuse distance profiles, where reuse distance is a hardware architecture-independent measure of the patterns of virtual memory accesses. It relies on a stochastic, static basic block-level analysis of reuse profiles measured from the memory traces of applications run sequentially on small instances rather than using a multi-threaded trace. The results indicate that the hit-rate predictions on the shared…
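To make the term concrete, here is a minimal Python sketch of the textbook definition of reuse distance: the number of distinct addresses touched between two consecutive accesses to the same address. It is not the paper's probabilistic model. It only shows how a reuse distance profile can be measured from a single-threaded trace and turned into a hit rate, under the assumption of a fully associative LRU cache. The function names `reuse_distance_profile` and `lru_hit_rate` and the toy trace are hypothetical.

```python
from collections import Counter
import math


def reuse_distance_profile(trace):
    """Count how often each reuse distance occurs in a memory trace.

    Reuse distance = number of distinct addresses accessed since the
    previous access to the same address. First-time (cold) accesses
    are recorded with distance math.inf.
    """
    lru_stack = []  # least recently used first, most recent last
    profile = Counter()
    for addr in trace:
        if addr in lru_stack:
            idx = lru_stack.index(addr)
            # Everything above idx was touched since the last access to addr.
            distance = len(lru_stack) - 1 - idx
            lru_stack.pop(idx)
        else:
            distance = math.inf
        profile[distance] += 1
        lru_stack.append(addr)
    return profile


def lru_hit_rate(profile, cache_blocks):
    """Hit rate for a fully associative LRU cache (illustrative assumption).

    An access hits when its reuse distance is smaller than the number
    of blocks the cache can hold.
    """
    total = sum(profile.values())
    hits = sum(count for dist, count in profile.items() if dist < cache_blocks)
    return hits / total if total else 0.0


# Toy trace of block addresses (hypothetical data).
trace = ["a", "b", "c", "a", "b", "b"]
profile = reuse_distance_profile(trace)
print(dict(profile))
print(lru_hit_rate(profile, cache_blocks=3))
```

This list-based stack costs O(N·M) time for N accesses and M distinct addresses, which is acceptable for a small illustration but not for real traces. The profile depends only on the access pattern, not on cache size, which is why one measured profile can be evaluated against several cache capacities.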
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
