Improving Memory Hierarchy Utilisation for Stencil Computations on Multicore Machines
Alexandre Sena, Aline Nascimento, Cristina Boeres, Vinod E. F. Rebello, André Bulcão

TL;DR
This paper proposes and experimentally evaluates an algorithm that identifies an efficient block size for MPI stencil computations on multicore machines, improving use of the memory hierarchy.
Contribution
It introduces a novel algorithm and methodology for selecting efficient block sizes to enhance memory hierarchy use in multicore stencil computations.
Findings
Identifies efficient block sizes for MPI stencil computations.
Demonstrates improved memory utilization on multicore architectures.
Provides an experimental framework for performance evaluation.
Abstract
Although modern supercomputers are composed of multicore machines, some scientists still run legacy applications that were developed for single-core clusters, where the memory hierarchy is dedicated to a single core. The main objective of this paper is to propose and evaluate an algorithm that identifies an efficient block size for MPI stencil computations on multicore machines. In light of an extensive experimental analysis, this work shows the benefits of identifying block sizes that divide the data among the cores, and suggests a methodology that exploits the memory hierarchy available in modern machines.
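The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of cache blocking for a stencil sweep, not the authors' method. It assumes a 2D 5-point Jacobi stencil, square tiles, and a simple capacity rule in which each core gets an equal share of a shared cache. The names `pick_block_size` and `jacobi_sweep` and all parameters are hypothetical.

```python
def pick_block_size(cache_bytes, cores_sharing, elem_bytes=8, arrays=2):
    """Return the largest square tile edge whose working set fits in one
    core's share of the cache (assumed capacity rule, not from the paper).

    The working set is approximated as `arrays` tiles of (edge + 2)^2
    elements, where the +2 accounts for the one-cell halo on each side.
    """
    share = cache_bytes // cores_sharing
    edge = 1
    # Grow the tile while the next larger tile would still fit.
    while arrays * (edge + 1 + 2) ** 2 * elem_bytes <= share:
        edge += 1
    return edge


def jacobi_sweep(src, dst, block):
    """One 5-point Jacobi sweep over the interior of `src`, visiting the
    grid tile by tile so each tile's data is reused while still cached."""
    n, m = len(src), len(src[0])
    for ii in range(1, n - 1, block):
        for jj in range(1, m - 1, block):
            for i in range(ii, min(ii + block, n - 1)):
                for j in range(jj, min(jj + block, m - 1)):
                    dst[i][j] = 0.25 * (src[i - 1][j] + src[i + 1][j]
                                        + src[i][j - 1] + src[i][j + 1])
```

Under this rule, sharing a cache among more cores shrinks the tile each core should use, which is the kind of effect a block-size selection step has to account for on multicore nodes. The blocked sweep computes the same values as an unblocked one; only the traversal order changes.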
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Neural Networks and Applications · Distributed and Parallel Computing Systems
