The Chamomile Scheme: An Optimized Algorithm for N-body simulations on Programmable Graphics Processing Units
Tsuyoshi Hamada, Toshiaki Iitaka

TL;DR
The paper introduces the Chamomile Scheme, a GPU algorithm for gravitational N-body simulations optimized for the NVIDIA GeForce8800GTX, whose shared memories are small and lack a broadcasting mechanism and whose floating point hardware is single precision only. The resulting library, CUNBODY-1, reaches up to 256 Gflop/s.
Contribution
It presents an algorithm tailored to the GeForce8800GTX architecture for calculating gravitational interactions, together with CUNBODY-1, a library for gravitational N-body simulations built on it.
Findings
Achieved 173 Gflop/s for 2048 particles
Reached 256 Gflop/s for 131072 particles
Optimized for NVIDIA GeForce8800GTX hardware
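
To put the reported rates in context, the short sketch below relates them to the 500 Gflop/s single-precision peak quoted in the abstract. It also estimates the time for one all-pairs force evaluation, under the assumption of 38 floating point operations per pairwise interaction. That count is a convention often used in the N-body literature; this summary does not say which convention the authors used, so the timing figures are illustrative only. The function names are hypothetical.

```python
PEAK_GFLOPS = 500.0          # single-precision peak quoted in the abstract
FLOPS_PER_INTERACTION = 38   # ASSUMPTION: common N-body convention, not stated in the source


def fraction_of_peak(measured_gflops, peak_gflops=PEAK_GFLOPS):
    """Measured rate as a fraction of the quoted hardware peak."""
    return measured_gflops / peak_gflops


def seconds_per_force_pass(n, measured_gflops,
                           flops_per_interaction=FLOPS_PER_INTERACTION):
    """Estimated time for one direct-summation pass over all n * n pairs."""
    total_flops = n * n * flops_per_interaction
    return total_flops / (measured_gflops * 1e9)


if __name__ == "__main__":
    # The two (particle count, Gflop/s) pairs reported in the findings.
    for n, gflops in [(2048, 173.0), (131072, 256.0)]:
        print(n, fraction_of_peak(gflops), seconds_per_force_pass(n, gflops))
```

The fraction of peak depends only on numbers given in the source; the time per pass scales linearly with whatever flop count per interaction is assumed.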
Abstract
We present an algorithm named "Chamomile Scheme". The scheme is fully optimized for calculating gravitational interactions on the latest programmable Graphics Processing Unit (GPU), the NVIDIA GeForce8800GTX, which has (a) small but fast shared memories (16 KBytes × 16) with no broadcasting mechanism and (b) floating point arithmetic hardware of 500 Gflop/s, but only for single precision. Based on this scheme, we have developed a library for gravitational N-body simulations, "CUNBODY-1", whose measured performance reaches 173 Gflop/s for 2048 particles and 256 Gflop/s for 131072 particles.
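
For readers unfamiliar with the computation being accelerated, the following is a minimal CPU reference sketch of direct-summation gravitational accelerations. It is not the Chamomile Scheme and not the CUNBODY-1 API: it only shows the all-pairs loop that a GPU library of this kind would parallelize. The Plummer softening length `eps`, the unit choice `G = 1`, and the function name are assumptions made for illustration.

```python
import math


def accelerations(pos, mass, eps=1e-2, G=1.0):
    """All-pairs gravitational accelerations (illustrative sketch only).

    pos  : list of [x, y, z] positions
    mass : list of particle masses
    eps  : ASSUMED Plummer softening length; must be > 0 so that the
           self-interaction term (i == j) is finite and contributes zero
    """
    n = len(pos)
    eps2 = eps * eps
    acc = []
    for i in range(n):
        xi, yi, zi = pos[i]
        ax = ay = az = 0.0
        for j in range(n):
            # Separation vector from particle i to particle j.
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps2
            # G * m_j / (r^2 + eps^2)^(3/2)
            s = G * mass[j] / (r2 * math.sqrt(r2))
            ax += s * dx
            ay += s * dy
            az += s * dz
        acc.append([ax, ay, az])
    return acc


if __name__ == "__main__":
    # Two unit masses one length unit apart attract each other.
    print(accelerations([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0], eps=1e-3))
```

The inner loop touches every particle once per target particle, which is why the cost grows as N² and why fast on-chip memory for the source particles matters on a GPU.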
Taxonomy
Topics: Scientific Research and Discoveries · Geophysics and Gravity Measurements · Computational Physics and Python Applications
