A shared compilation stack for distributed-memory parallelism in stencil DSLs
George Bisbas, Anton Lydike, Emilien Bauer, Nick Brown, Mathieu Fehr, Lawrence Mitchell, Gabriel Rodriguez-Canal, Maurice Jamieson, Paul H. J. Kelly, Michel Steuwer, Tobias Grosser

TL;DR
This paper presents a shared compilation framework based on MLIR for distributed-memory parallelism in stencil DSLs, enabling shared infrastructure and high-performance code generation across multiple HPC stencil compilers.
Contribution
It adapts MLIR for HPC stencil DSLs, introducing new abstractions for message passing and demonstrating shared compiler components across three different stencil DSLs.
Findings
Shared compiler components improve development efficiency.
Framework generates high-performance distributed stencil code.
Shared infrastructure enhances scalability and maintainability.
Abstract
Domain Specific Languages (DSLs) increase programmer productivity and provide high performance. Their targeted abstractions allow scientists to express problems at a high level, providing rich details that optimizing compilers can exploit to target current- and next-generation supercomputers. The convenience and performance of DSLs come with significant development and maintenance costs. The siloed design of DSL compilers and the resulting inability to benefit from shared infrastructure cause uncertainties around longevity and the adoption of DSLs at scale. By tailoring the broadly-adopted MLIR compiler framework to HPC, we bring the same synergies that the machine learning community already exploits across their DSLs (e.g. TensorFlow, PyTorch) to the finite-difference stencil HPC community. We introduce new HPC-specific abstractions for message passing targeting distributed stencil…
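For readers unfamiliar with the term, the sketch below illustrates what message passing for a distributed stencil generally involves: the grid is split across ranks, each rank keeps one extra "halo" cell per side, and neighbours swap edge values before every stencil update. It is not the paper's implementation or IR. The function names (`decompose`, `halo_exchange`, `distributed_step`, `gather`), the 1D three-point averaging stencil, and the in-process simulation of ranks with plain lists are all assumptions made for illustration; a real run would replace the copies in `halo_exchange` with point-to-point messages.

```python
def serial_step(u):
    """One 3-point averaging step on the whole grid; the two boundary values stay fixed."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return new


def decompose(u, nranks):
    """Split u into nranks contiguous chunks, each padded with one halo cell per side.

    Assumes len(u) >= nranks so that every rank owns at least one cell.
    """
    base, extra = divmod(len(u), nranks)
    chunks, start = [], 0
    for r in range(nranks):
        size = base + (1 if r < extra else 0)
        chunks.append([0.0] + u[start:start + size] + [0.0])
        start += size
    return chunks


def halo_exchange(chunks):
    """Copy each rank's edge values into its neighbours' halo cells (stand-in for messages)."""
    for r in range(len(chunks) - 1):
        left, right = chunks[r], chunks[r + 1]
        right[0] = left[-2]   # last owned cell of the left rank -> left halo of the right rank
        left[-1] = right[1]   # first owned cell of the right rank -> right halo of the left rank


def distributed_step(chunks):
    """Exchange halos, then update every owned cell except the two global boundary cells."""
    halo_exchange(chunks)
    last = len(chunks) - 1
    out = []
    for r, c in enumerate(chunks):
        new = list(c)
        lo = 2 if r == 0 else 1                        # skip the global left boundary
        hi = len(c) - 2 if r == last else len(c) - 1   # skip the global right boundary
        for i in range(lo, hi):
            new[i] = (c[i - 1] + c[i] + c[i + 1]) / 3.0
        out.append(new)
    return out


def gather(chunks):
    """Concatenate the owned cells (halos dropped) back into one global list."""
    return [x for c in chunks for x in c[1:-1]]


if __name__ == "__main__":
    u = [float((i * i) % 7) for i in range(10)]
    chunks, ref = decompose(u, 3), u
    for _ in range(3):
        chunks = distributed_step(chunks)
        ref = serial_step(ref)
    print(gather(chunks) == ref)
```

Because each owned cell is updated with the same arithmetic in the same order as in the serial version, the decomposed result can be compared against the serial one directly, which is a convenient sanity check for any halo-exchange scheme.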
Taxonomy
Topics: Embedded Systems Design Techniques · Parallel Computing and Optimization Techniques · Interconnection Networks and Systems
