From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
Marek Blazewicz (1, 2), Ian Hinder (3), David M. Koppelman (4, 5), Steven R. Brandt (4, 6), Milosz Ciznicki (1), Michal Kierzynka (1, 2), Frank Löffler (4), Erik Schnetter (7, 8, 4), Jian Tao (4) ((1) Poznań Supercomputing and Networking Center

TL;DR
Chemora is a framework that automates the translation of high-level PDE descriptions into optimized, high-performance code for diverse architectures, enabling complex simulations like black hole collisions without manual low-level tuning.
Contribution
The paper introduces Chemora, a framework that extends Cactus with automated, high-level code generation and optimization across CPU and GPU systems for complex PDE-based applications.
Findings
Successfully simulated black hole collisions using Chemora
Achieved efficient parallelism with MPI and multi-threading
Demonstrated high performance without low-level code tuning
Abstract
Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of…
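To make the discretization step concrete, here is a minimal sketch of a higher-order centered finite-difference derivative on a uniform 1D grid. It is an illustration only, not code produced by or taken from Chemora: the abstract does not state which order is used, so fourth order is an assumption, and the names `d1_fourth_order` and `max_error` are hypothetical.

```python
import math


def d1_fourth_order(u, h):
    """Approximate du/dx at interior points with the 4th-order centered stencil.

    Assumption: uniform grid spacing h. The two points nearest each boundary
    are left at 0.0 because the stencil needs two neighbours on each side.
    """
    n = len(u)
    du = [0.0] * n
    for i in range(2, n - 2):
        du[i] = (u[i - 2] - 8.0 * u[i - 1] + 8.0 * u[i + 1] - u[i + 2]) / (12.0 * h)
    return du


def max_error(n):
    """Largest interior error when differentiating sin(x) on n points over [0, 2*pi)."""
    h = 2.0 * math.pi / n
    x = [i * h for i in range(n)]
    u = [math.sin(xi) for xi in x]
    du = d1_fourth_order(u, h)
    return max(abs(du[i] - math.cos(x[i])) for i in range(2, n - 2))


if __name__ == "__main__":
    coarse = max_error(50)
    fine = max_error(100)
    # Halving h should shrink the error by roughly 2**4 for a 4th-order scheme.
    print(coarse, fine, coarse / fine)
```

A framework of the kind described would emit the equivalent of the inner loop as architecture-specific code (for example an OpenMP loop or a CUDA kernel) rather than interpret it in Python; the sketch shows only the arithmetic of the stencil.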
Taxonomy
Topics: Computational Physics and Python Applications · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
