# GraviDy, a GPU modular, parallel direct-summation $N$-body integrator: Dynamics with softening

**Authors:** Cristián Maureira-Fredes, Pau Amaro-Seoane

arXiv: 1702.00440 · 2017-11-29

## TL;DR

GraviDy is a modular, GPU-accelerated direct-summation N-body integrator based on the Hermite scheme. Its single-GPU version runs about 200 times faster than its single-CPU version, and its modular design lets users readily add new physics.

## Contribution

This paper introduces GraviDy, a new GPU-based direct-summation N-body integrator written from scratch around the Hermite scheme. The code is highly modular, runs in parallel on multiple CPUs and GPUs, and is to be maintained through public, regular updates.

## Key findings

- The single-GPU version runs about 200 times faster than the single-CPU version
- A test run on 4 GPUs in parallel achieves a speed-up of about 3 over the single-GPU version
- The code is highly modular, so users can readily introduce new physics
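
To put the multi-GPU figure in context, the short sketch below converts the reported speed-up into a parallel efficiency, using the usual definition of efficiency as speed-up divided by device count. The function name `parallel_efficiency` is hypothetical, and the input of 3.0 is the approximate value quoted in the abstract, so the result is only indicative.

```python
def parallel_efficiency(speedup, n_devices):
    """Return the speed-up per device, S / p (standard definition)."""
    if n_devices <= 0:
        raise ValueError("n_devices must be positive")
    return speedup / n_devices


# Approximate figures from the abstract: ~3x speed-up on 4 GPUs vs. 1 GPU.
efficiency = parallel_efficiency(3.0, 4)
print(f"parallel efficiency ~ {efficiency:.2f}")  # prints "parallel efficiency ~ 0.75"
```

Under these rounded numbers, each of the four GPUs contributes roughly three quarters of what an ideal linear scaling would give.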

## Abstract

A wide variety of outstanding problems in astrophysics involve the motion of a large number of particles ($N\gtrsim 10^{6}$) under the force of gravity. These include the global evolution of globular clusters, tidal disruptions of stars by a massive black hole, the formation of protoplanets and the detection of sources of gravitational radiation. The direct summation of $N$ gravitational forces is a complex problem with no analytical solution and can only be tackled with approximations and numerical methods. To this end, the Hermite scheme is a widely used integration method. With different numerical techniques and special-purpose hardware, it can be used to speed up the calculations. But these methods tend to be computationally slow and cumbersome to work with. Here we present a new GPU, direct-summation $N$-body integrator written from scratch and based on this scheme. This code has high modularity, allowing users to readily introduce new physics; it exploits available high-performance computing resources and will be maintained by public, regular updates. The code can be used in parallel on multiple CPUs and GPUs, with a considerable speed-up benefit. The single GPU version runs about 200 times faster compared to the single CPU version. A test run using 4 GPUs in parallel shows a speed-up factor of about 3 as compared to the single GPU version. The conception and design of this first release is aimed at users with access to traditional parallel CPU clusters or computational nodes with one or a few GPU cards.
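
As an illustration of what direct summation with softening involves, here is a minimal pure-Python sketch that computes the acceleration and its time derivative (the jerk) for every particle; these are the two quantities a Hermite scheme needs. It is not GraviDy's implementation. The Plummer-style softening form $(r^2 + \epsilon^2)^{-3/2}$, the units with $G = 1$, and the names `acc_and_jerk` and `eps` are all assumptions made for this example.

```python
import math


def acc_and_jerk(masses, positions, velocities, eps=1e-4, G=1.0):
    """Direct O(N^2) summation of softened accelerations and jerks.

    Assumes Plummer-style softening: r^2 is replaced by r^2 + eps^2.
    positions and velocities are lists of [x, y, z] lists.
    """
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    jerk = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-interaction
            # Relative position and velocity of particle j as seen from i.
            r = [positions[j][k] - positions[i][k] for k in range(3)]
            v = [velocities[j][k] - velocities[i][k] for k in range(3)]
            r2 = sum(c * c for c in r) + eps * eps  # softened squared distance
            rv = sum(r[k] * v[k] for k in range(3))
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            inv_r5 = inv_r3 / r2
            for k in range(3):
                acc[i][k] += G * masses[j] * r[k] * inv_r3
                jerk[i][k] += G * masses[j] * (
                    v[k] * inv_r3 - 3.0 * rv * r[k] * inv_r5
                )
    return acc, jerk


# Two unit masses one length unit apart, at rest, with softening switched off.
acc, jerk = acc_and_jerk(
    [1.0, 1.0],
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    eps=0.0,
)
print(acc[0], acc[1])
```

The double loop makes the $O(N^2)$ cost of direct summation explicit. Each particle's sum is independent of the others, which is the property that makes this kind of calculation a natural fit for GPU parallelism.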

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.00440/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/1702.00440/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/1702.00440/full.md

---
Source: https://tomesphere.com/paper/1702.00440