Optimizing Large-Scale ODE Simulations
Mario Mulansky

TL;DR
This paper applies cache optimization and SIMD instructions to Runge-Kutta simulations of large ODE systems with nearest-neighbor coupling. The cache optimization turns the algorithm from bandwidth bound to CPU bound, and the combined speedup reaches up to a factor of three over a standard implementation.
Contribution
It introduces a granularity (clustering) into the simulation to reduce the required cache/memory bandwidth, identifies the optimal cluster size in a performance study, and shows that SIMD instructions raise efficiency further.
Findings
Cache optimization and SIMD instructions together increase performance by up to a factor of three compared to a standard implementation.
Cache optimization shifts the bottleneck from memory bandwidth to CPU processing.
Implementing with the C++ libraries Boost.odeint and Boost.SIMD simplifies applying these optimizations.
Abstract
We present a strategy to speed up Runge-Kutta-based ODE simulations of large systems with nearest-neighbor coupling. We identify the cache/memory bandwidth as the crucial performance bottleneck. To reduce the required bandwidth, we introduce a granularity in the simulation and identify the optimal cluster size in a performance study. This leads to a considerable performance increase and transforms the algorithm from bandwidth bound to CPU bound. By additionally employing SIMD instructions we are able to boost the efficiency even further. In the end, a total performance increase of up to a factor of three is observed when using cache optimization and SIMD instructions compared to a standard implementation. All simulation codes are written in C++ and made publicly available. By using the modern C++ libraries Boost.odeint and Boost.SIMD, these optimizations can be implemented with minimal…
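To make the idea of granularity concrete, the following is a minimal sketch and not the paper's published code. It assumes a toy model (diffusive nearest-neighbor coupling on a periodic ring) and one possible clustering scheme: each cluster is copied into a small local buffer together with a halo of four neighbors per side, and all four RK4 stages are then evaluated on that buffer before moving on. The names `rhs`, `rk4_step_standard`, `rk4_step_clustered`, and `max_abs_diff` are hypothetical, and the paper may treat cluster boundaries differently. It uses only the standard library, not Boost.odeint or Boost.SIMD.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

using state = std::vector<double>;

// Assumed toy model (not from the paper): diffusive nearest-neighbor
// coupling on a periodic ring, dx_i/dt = x_{i-1} - 2*x_i + x_{i+1}.
inline double rhs(double left, double mid, double right) {
    return left - 2.0 * mid + right;
}

// Standard RK4 step: every stage sweeps the whole state vector.
state rk4_step_standard(const state& x, double dt) {
    const std::size_t n = x.size();
    auto f = [n](const state& y) {
        state k(n);
        for (std::size_t i = 0; i < n; ++i)
            k[i] = rhs(y[(i + n - 1) % n], y[i], y[(i + 1) % n]);
        return k;
    };
    auto axpy = [n](const state& y, double a, const state& k) {
        state r(n);
        for (std::size_t i = 0; i < n; ++i) r[i] = y[i] + a * k[i];
        return r;
    };
    const state k1 = f(x);
    const state k2 = f(axpy(x, dt / 2, k1));
    const state k3 = f(axpy(x, dt / 2, k2));
    const state k4 = f(axpy(x, dt, k3));
    state out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = x[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]);
    return out;
}

// Clustered RK4 step (illustrative scheme): all four stages run on one
// cluster plus a halo of H = 4 sites per side, so the working set is
// cluster + 8 values per buffer. Each stage loses one valid site at both
// ends of the local window. Assumes cluster > 0.
state rk4_step_clustered(const state& x, double dt, std::size_t cluster) {
    const std::size_t n = x.size(), H = 4;
    state out(n), xl, y, k1, k2, k3, k4;
    for (std::size_t a = 0; a < n; a += cluster) {
        const std::size_t m = std::min(cluster, n - a), w = m + 2 * H;
        for (state* v : {&xl, &y, &k1, &k2, &k3, &k4}) v->assign(w, 0.0);
        // Local copy of the cluster and its halo, with periodic wrap-around.
        for (std::size_t j = 0; j < w; ++j) xl[j] = x[(a + j + H * n - H) % n];
        auto eval = [&](const state& in, state& k, std::size_t s) {
            for (std::size_t j = s; j < w - s; ++j)
                k[j] = rhs(in[j - 1], in[j], in[j + 1]);
        };
        auto combine = [&](double c, const state& k, std::size_t s) {
            for (std::size_t j = s; j < w - s; ++j) y[j] = xl[j] + c * k[j];
        };
        eval(xl, k1, 1); combine(dt / 2, k1, 1);
        eval(y, k2, 2);  combine(dt / 2, k2, 2);
        eval(y, k3, 3);  combine(dt, k3, 3);
        eval(y, k4, 4);
        // Only the cluster interior [H, w - H) is valid after four stages.
        for (std::size_t j = H; j < w - H; ++j)
            out[a + j - H] =
                xl[j] + dt / 6 * (k1[j] + 2 * k2[j] + 2 * k3[j] + k4[j]);
    }
    return out;
}

// Largest element-wise deviation between two states of equal length.
double max_abs_diff(const state& a, const state& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        d = std::max(d, std::fabs(a[i] - b[i]));
    return d;
}
```

Both functions should agree up to rounding for any positive cluster size, which `max_abs_diff` can confirm. In this sketch the trade-off is that smaller clusters keep the working set small but recompute more halo values, while larger clusters waste less work but may no longer fit in cache, which is why an optimal cluster size has to be found empirically.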
Taxonomy
Topics: Simulation Techniques and Applications · Numerical methods for differential equations · Advanced Data Storage Technologies
