Coloured and task-based stencil codes
Benjamin Hazelwood, Tobias Weinzierl

TL;DR
This paper evaluates traditional and modern parallelization strategies for stencil codes on shared memory systems, proposing hybrid methods that combine colouring and task-based approaches for improved performance.
Contribution
It introduces two efficient hybrid parallelization methods that fuse colouring and task-based strategies for stencil codes on Cartesian grids.
Findings
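Traditional multithreading strategies (colouring and recursive tiling) perform differently on Broadwell and KNL.
Specifying data dependencies explicitly in OpenMP leaves the work distribution to the runtime, which then determines how tasks are assigned to threads.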
Hybrid approaches outperform pure strategies in certain hardware configurations.
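To make the finding on explicit data dependencies concrete, the following is a minimal sketch of a task-based stencil sweep in C with OpenMP. It is not the paper's code: the 1D three-point averaging stencil, the double-buffer layout, and the names task_sweeps and update_block are assumptions chosen for illustration. Each block of cells becomes one task, and the first cell of each block serves as a stand-in for the whole block in the depend clauses, so the runtime can order tasks across sweeps and decide on its own which thread runs which task.

```c
#include <assert.h>

/* Hypothetical helper: averaging update of cells [lo, hi) from src into dst. */
static void update_block(const double *src, double *dst, int lo, int hi) {
    for (int i = lo; i < hi; ++i)
        dst[i] = 0.5 * (src[i - 1] + src[i + 1]);
}

/* Runs `sweeps` sweeps over n cells, alternating between buffers a and b.
   Cells 0 and n-1 are fixed boundary values and must be set in both buffers.
   The result ends up in a if sweeps is even, in b otherwise. */
void task_sweeps(double *a, double *b, int n, int block, int sweeps) {
    assert(n >= 3 && block >= 1);
    #pragma omp parallel
    #pragma omp single
    for (int s = 0; s < sweeps; ++s) {
        double *src = (s % 2 == 0) ? a : b;
        double *dst = (s % 2 == 0) ? b : a;
        for (int lo = 1; lo < n - 1; lo += block) {
            int hi = (lo + block < n - 1) ? lo + block : n - 1;
            /* Representative cells of the left and right neighbour blocks;
               at the domain ends, fall back to the fixed boundary cells. */
            int left  = (lo > 1) ? lo - block : 0;
            int right = (lo + block < n - 1) ? lo + block : n - 1;
            /* A task may start once its own block and both neighbours have
               been written by the previous sweep, and once nobody still
               reads the block it is about to overwrite. */
            #pragma omp task depend(in: src[left], src[lo], src[right]) \
                             depend(out: dst[lo])
            update_block(src, dst, lo, hi);
        }
    }
}
```

With a = b = {0, 0, 0, 0, 4}, a call task_sweeps(a, b, 5, 2, 2) leaves a = {0, 0, 1, 2, 4}. Compiled without OpenMP support, the pragmas are ignored and the sweeps run sequentially with the same result.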
Abstract
Simple stencil codes remain an important building block in scientific computing. On shared memory nodes, they are traditionally parallelised through colouring or (recursive) tiling. Newer OpenMP versions alternatively allow users to specify data dependencies explicitly and to outsource the decision of how to distribute the work to the runtime system. We evaluate traditional multithreading strategies on both Broadwell and KNL, study the resulting assignment of tasks to threads and, from there, derive two efficient ways to parallelise stencil codes on regular Cartesian grids that fuse colouring and task-based approaches.
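As an illustration of the colouring strategy the abstract mentions, here is a minimal sketch in C with OpenMP. The abstract does not specify the stencil or the colouring scheme, so the in-place 5-point update, the two-colour (red-black) pattern, and the function name red_black_sweep are assumptions made for this example. The point it shows is that cells of one colour only read cells of the other colour, so each colour can be processed by a plain parallel loop.

```c
#include <assert.h>

/* One red-black sweep of an in-place 5-point averaging stencil on an
   n x n grid stored row-major in u. Boundary cells stay fixed. */
void red_black_sweep(double *u, int n) {
    assert(n >= 3);
    for (int colour = 0; colour < 2; ++colour) {
        /* Cells with (i + j) % 2 == colour read only cells of the other
           colour, so the rows can be updated concurrently. The implicit
           barrier at the end of the loop separates the two colours. */
        #pragma omp parallel for schedule(static)
        for (int i = 1; i < n - 1; ++i) {
            /* First interior column j with (i + j) % 2 == colour. */
            int start = 1 + ((i + 1 + colour) % 2);
            for (int j = start; j < n - 1; j += 2) {
                u[i * n + j] = 0.25 * (u[(i - 1) * n + j] + u[(i + 1) * n + j]
                                     + u[i * n + j - 1] + u[i * n + j + 1]);
            }
        }
    }
}
```

Here the schedule is fixed by the programmer (one synchronisation per colour), whereas a task-based formulation leaves the ordering to the runtime within the declared dependencies.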
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Software Testing and Debugging Techniques
