VINE -- A numerical code for simulating astrophysical systems using particles II: Implementation and performance characteristics
Andrew F. Nelson (1), M. Wetzstein (2,3), T. Naab (3,4) ((1) LANL, HPC-5; (2) Dept of Astrophysical Sciences, Princeton; (3) Universitaets Sternwarte, Muenchen; (4) Inst of Astronomy, Cambridge)

TL;DR
VINE is an optimized astrophysical particle simulation code that runs efficiently, without modification, on serial workstations and on shared memory parallel machines, and can also obtain forces from special purpose GRAPE hardware. Its gravity and SPH calculations scale nearly linearly with processor count.
Contribution
This paper details the implementation of VINE, the optimizations made to its particle data layout and binary tree, the modifications needed to use GRAPE hardware, and the code's measured performance on serial and OpenMP shared memory systems.
Findings
The code optimizations improve overall performance by more than an order of magnitude.
Parallel scaling is nearly linear up to 120 processors on shared memory systems.
At similar accuracy, VINE in GRAPE-tree mode runs about a factor of two slower than in host-only mode.
Abstract
We continue our presentation of VINE. We begin with a description of relevant architectural properties of the serial and shared memory parallel computers on which VINE is intended to run, and describe their influences on the design of the code itself. We continue with a detailed description of a number of optimizations made to the layout of the particle data in memory and to our implementation of a binary tree used to access that data for use in gravitational force calculations and searches for SPH neighbor particles. We describe modifications to the code necessary to obtain forces efficiently from special purpose 'GRAPE' hardware. We conclude with an extensive series of performance tests, which demonstrate that the code can be run efficiently and without modification in serial on small workstations or in parallel using OpenMP compiler directives on large scale, shared memory parallel…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Spacecraft and Cryogenic Technologies · Distributed and Parallel Computing Systems
