GPU-Based Parallel Integration of Large Numbers of Independent ODE Systems
Kyle E Niemeyer, Chih-Jen Sung

TL;DR
This paper presents GPU-based implementations of explicit and stabilized explicit algorithms for integrating large numbers of independent ODE systems, and reports significant performance improvements over CPU implementations.
Contribution
It presents GPU implementations of the fifth-order adaptive Runge-Kutta-Cash-Karp (RKCK) algorithm for nonstiff systems and the Runge-Kutta-Chebyshev (RKC) algorithm for moderately stiff systems, targeting the integration of large numbers of independent ODE systems.
Findings
RKC handles moderate stiffness efficiently on GPUs.
GPU implementations are substantially faster than their CPU counterparts in the reported performance comparisons.
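To illustrate why a Chebyshev-type method can take steps that a standard explicit method cannot, here is a minimal sketch. It is an assumption-laden simplification: it uses the first-order, undamped s-stage Chebyshev scheme (a simpler relative of the RKC family, not the formulation evaluated in the paper), applied to a scalar test problem. The function names `chebyshev_step` and `euler_step` are hypothetical. For the linear problem y' = λy, this scheme stays bounded when 0 ≥ hλ ≥ -2s², whereas forward Euler needs hλ ≥ -2.

```python
def euler_step(f, y, h):
    """One forward Euler step for the scalar autonomous ODE y' = f(y)."""
    return y + h * f(y)


def chebyshev_step(f, y, h, s):
    """One step of the first-order, undamped s-stage Chebyshev scheme.

    The stages follow the Chebyshev three-term recurrence, so for
    y' = lam * y the step multiplies y by T_s(1 + h*lam / s**2).
    With s = 1 this reduces to forward Euler.
    """
    y_prev = y
    y_curr = y + (h / s**2) * f(y)
    for _ in range(2, s + 1):
        y_next = 2.0 * y_curr - y_prev + (2.0 * h / s**2) * f(y_curr)
        y_prev, y_curr = y_curr, y_next
    return y_curr


def run(step, y0, n_steps):
    """Apply a one-argument step function n_steps times."""
    y = y0
    for _ in range(n_steps):
        y = step(y)
    return y


# Illustrative (made-up) test problem: y' = lam * y with h * lam = -100.
lam = -1000.0
h = 0.1


def rhs(y):
    return lam * y


y_euler = run(lambda y: euler_step(rhs, y, h), 1.0, 20)
# s = 8 stages gives a real stability interval of length 2 * 8**2 = 128 >= 100.
y_cheb = run(lambda y: chebyshev_step(rhs, y, h, 8), 1.0, 20)
```

With these values forward Euler grows without bound while the eight-stage scheme remains bounded at the same step size. The cost is eight right-hand-side evaluations per step instead of one, with no Jacobian or linear solve.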
Abstract
The task of integrating a large number of independent ODE systems arises in various scientific and engineering areas. For nonstiff systems, common explicit integration algorithms can be used on GPUs, where individual GPU threads concurrently integrate independent ODEs with different initial conditions or parameters. One example is the fifth-order adaptive Runge-Kutta-Cash-Karp (RKCK) algorithm. In the case of stiff ODEs, standard explicit algorithms require impractically small time-step sizes for stability reasons, and implicit algorithms are therefore commonly used instead to allow larger time steps and reduce the computational expense. However, typical high-order implicit algorithms based on backwards differentiation formulae (e.g., VODE, LSODE) involve complex logical flow that causes severe thread divergence when implemented on GPUs, limiting the performance. Therefore, alternate…
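The per-thread work described above for the nonstiff case can be sketched as follows. This is a serial Python illustration, not the paper's GPU code: the loop in `integrate_batch` stands in for the one-thread-per-system mapping, and the names `rkck_step`, `integrate`, `integrate_batch`, and `decay`, the step-size controller constants, and the test problem are all assumptions made for the example. The Cash-Karp coefficients are the standard published ones.

```python
import math

# Cash-Karp tableau: nodes C, stage coefficients A, and the
# fifth-order (B5) and embedded fourth-order (B4) weights.
C = (0.0, 1/5, 3/10, 3/5, 1.0, 7/8)
A = (
    (),
    (1/5,),
    (3/40, 9/40),
    (3/10, -9/10, 6/5),
    (-11/54, 5/2, -70/27, 35/27),
    (1631/55296, 175/512, 575/13824, 44275/110592, 253/4096),
)
B5 = (37/378, 0.0, 250/621, 125/594, 0.0, 512/1771)
B4 = (2825/27648, 0.0, 18575/48384, 13525/55296, 277/14336, 1/4)


def rkck_step(f, t, y, h, params):
    """One RKCK step; returns the fifth-order solution and an error estimate."""
    n = len(y)
    k = []
    for i in range(6):
        yi = [y[m] + h * sum(A[i][j] * k[j][m] for j in range(i)) for m in range(n)]
        k.append(f(t + C[i] * h, yi, params))
    y5 = [y[m] + h * sum(B5[i] * k[i][m] for i in range(6)) for m in range(n)]
    y4 = [y[m] + h * sum(B4[i] * k[i][m] for i in range(6)) for m in range(n)]
    err = max(abs(a - b) for a, b in zip(y5, y4))
    return y5, err


def integrate(f, y0, t0, t_end, params, tol=1e-8, h=1e-2):
    """Adaptive integration of one system from t0 to t_end."""
    t, y = t0, list(y0)
    while t < t_end:
        last = h >= t_end - t
        if last:
            h = t_end - t
        y_new, err = rkck_step(f, t, y, h, params)
        if err <= tol:  # accept the step
            t = t_end if last else t + h
            y = y_new
        # Grow or shrink h based on the error estimate (assumed controller).
        scale = 5.0 if err == 0.0 else 0.9 * (tol / err) ** 0.2
        h *= min(5.0, max(0.2, scale))
    return y


def integrate_batch(f, y0_list, t0, t_end, params_list):
    """Integrate independent systems; on a GPU each entry would map to one thread."""
    return [integrate(f, y0, t0, t_end, p) for y0, p in zip(y0_list, params_list)]


def decay(t, y, k):
    """Illustrative right-hand side: y' = -k * y."""
    return [-k * y[0]]


rates = [0.5, 1.0, 2.0]
results = integrate_batch(decay, [[1.0]] * len(rates), 0.0, 1.0, rates)
```

Each system chooses its own step sizes, so systems with different parameters take different numbers of steps. On a GPU, that difference in work between neighboring threads is one source of the thread divergence the abstract mentions.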
