Graphic-Card Cluster for Astrophysics (GraCCA) -- Performance Tests
Hsi-Yu Schive, Chia-Hung Chien, Shing-Kwong Wong, Yu-Chih Tsai, and Tzihong Chiueh

TL;DR
The GraCCA system is a graphic-card cluster built for astrophysics simulations. It reaches a measured 7.1 TFLOPS on a parallel direct N-body simulation and, compared with the GRAPE-6A cluster at RIT, delivers more than twice the measured speed at a higher performance-per-dollar ratio.
Contribution
This paper introduces the GraCCA cluster architecture (16 nodes, each with 2 NVIDIA GeForce 8800 GTX graphic cards) and measures its performance and parallel efficiency with a parallel direct N-body code, comparing the results against the GRAPE-6A cluster at RIT.
Findings
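Achieves a measured 7.1 TFLOPS (16.2 TFLOPS theoretical) and 90% parallel efficiency when simulating a globular cluster of 1024K particles.
Delivers more than twice the measured speed of the GRAPE-6A cluster at RIT, with a higher performance-per-dollar ratio.
Handles up to 320M particles in simulations.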
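As a quick sanity check on these numbers, the short sketch below works through the arithmetic using only the figures quoted in the abstract (16 nodes, 2 cards per node, 16.2 TFLOPS theoretical, 7.1 TFLOPS measured). The variable names are illustrative. Note that the fraction of theoretical peak computed here is a different measure from the 90% parallel efficiency, which the abstract does not define in detail.

```python
# Figures taken from the abstract; variable names are illustrative only.
NODES = 16
GPUS_PER_NODE = 2
PEAK_TFLOPS = 16.2      # theoretical performance of the whole cluster
MEASURED_TFLOPS = 7.1   # measured on the 1024K-particle run

total_gpus = NODES * GPUS_PER_NODE

# Per-card averages, in GFLOPS (1 TFLOPS = 1000 GFLOPS).
peak_per_gpu_gflops = PEAK_TFLOPS * 1000 / total_gpus
measured_per_gpu_gflops = MEASURED_TFLOPS * 1000 / total_gpus

# Fraction of theoretical peak actually sustained by the N-body run.
fraction_of_peak = MEASURED_TFLOPS / PEAK_TFLOPS

print(f"GPUs in cluster:       {total_gpus}")
print(f"Peak per GPU:          {peak_per_gpu_gflops:.1f} GFLOPS")
print(f"Measured per GPU:      {measured_per_gpu_gflops:.1f} GFLOPS")
print(f"Fraction of peak used: {fraction_of_peak:.1%}")
```

Under these figures the cluster has 32 cards, each averaging roughly 222 GFLOPS measured against roughly 506 GFLOPS theoretical, or about 44% of peak.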
Abstract
In this paper, we describe the architecture and performance of the GraCCA system, a Graphic-Card Cluster for Astrophysics simulations. It consists of 16 nodes, each equipped with 2 modern graphic cards, the NVIDIA GeForce 8800 GTX. This computing cluster provides a theoretical performance of 16.2 TFLOPS. To demonstrate its performance in astrophysics computation, we have implemented a parallel direct N-body simulation program with a shared time-step algorithm on this system. Our system achieves a measured performance of 7.1 TFLOPS and a parallel efficiency of 90% for simulating a globular cluster of 1024K particles. Compared with the GRAPE-6A cluster at RIT (Rochester Institute of Technology), the GraCCA system achieves more than twice the measured speed and an even higher performance-per-dollar ratio. Moreover, our system can handle up to 320M particles and can serve…
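To make "direct N-body simulation with a shared time-step" concrete, here is a minimal serial sketch in plain Python. It is not the authors' GPU implementation: the leapfrog (kick-drift-kick) integrator, the softening length `eps`, the units with `G = 1`, and all function names are assumptions chosen for brevity, since the abstract does not specify them. What the sketch does share with the method named in the abstract is its structure: every particle's acceleration is summed directly over all other particles (an O(N²) loop, about 1.1 × 10^12 pair interactions per step for 1024K particles if K denotes 1024), and all particles advance with the same `dt`.

```python
import math

def accelerations(pos, mass, eps=1e-3, G=1.0):
    """Direct-summation gravity: each particle feels every other particle."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            # Softened squared distance (eps is an assumed softening length).
            r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * dx[k] * inv_r3
    return acc

def shared_step(pos, vel, mass, dt, eps=1e-3):
    """Advance ALL particles by the same dt (shared time-step), in place.

    Uses a kick-drift-kick leapfrog, an assumption made for simplicity.
    """
    acc = accelerations(pos, mass, eps)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]   # half kick
            pos[i][k] += dt * vel[i][k]         # drift
    acc = accelerations(pos, mass, eps)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]   # half kick

def separation(pos, i, j):
    """Euclidean distance between particles i and j."""
    return math.sqrt(sum((pos[i][k] - pos[j][k]) ** 2 for k in range(3)))
```

A two-body usage example: two equal masses of 0.5 placed at x = ±0.5 with velocities of ∓0.5 along y form an approximately circular orbit in these units, so repeated calls to `shared_step(pos, vel, mass, dt=0.01)` should keep `separation(pos, 0, 1)` close to 1. In a GPU version, the inner pairwise loop is the part that would typically be parallelized, since each particle's sum is independent of the others.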
