Scaling of a Fast Fourier Transform and a Pseudo-spectral Fluid Solver up to 196608 cores
Anando G. Chatterjee, Mahendra K. Verma, Abhishek Kumar, Ravi Samtaney, Bilel Hadri, and Rooh Khurram

TL;DR
This paper evaluates the scalability of the FFT library FFTK and the pseudospectral fluid solver Tarang on Blue Gene/P and Cray XC40 supercomputers. Both show weak and strong scaling nearly up to 196608 cores, with communication rather than computation as the main bottleneck.
Contribution
It provides a detailed scaling analysis of FFTK and Tarang on the Blue Gene/P and Cray XC40 supercomputers, separating computation time from communication time and comparing the two machines.
Findings
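Communication dominates computation, more so on the Cray XC40.
Computation time scales as p^{-1}; communication time scales as n^{-\eta}, with \eta from 0.7 to 0.9 on Blue Gene/P and from 0.43 to 0.73 on Cray XC40.
FFTK and the fluid and convection solvers of Tarang show weak and strong scaling nearly up to 196608 cores of the Cray XC40.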
Abstract
In this paper we present scaling results of an FFT library, FFTK, and a pseudospectral code, Tarang, on grid resolutions up to grid using 65536 cores of Blue Gene/P and 196608 cores of Cray XC40 supercomputers. We observe that communication dominates computation, more so on the Cray XC40. The computation time scales as p^{-1}, and the communication time as n^{-\eta}, with \eta ranging from 0.7 to 0.9 for Blue Gene/P, and from 0.43 to 0.73 for Cray XC40. FFTK, and the fluid and convection solvers of Tarang, exhibit weak as well as strong scaling nearly up to 196608 cores of Cray XC40. We perform a comparative study of the performance on the Blue Gene/P and Cray XC40 clusters.
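The power-law scalings quoted in the abstract can be illustrated with a small sketch. It is not the authors' analysis code: it assumes that p counts cores and n counts nodes, and every number in it (the prefactors, the 32 cores per node, the synthetic timings) is hypothetical. The function names fit_power_law and comm_fraction are likewise made up for this illustration. The first function recovers a communication exponent from timings by a least-squares fit in log-log space; the second shows why an exponent below 1 makes communication take a growing share of the total time as the core count rises.

```python
import math

def fit_power_law(xs, ts):
    """Fit t = c * x**(-eta) by least squares in log-log space; return (c, eta)."""
    log_x = [math.log(x) for x in xs]
    log_t = [math.log(t) for t in ts]
    mean_x = sum(log_x) / len(log_x)
    mean_t = sum(log_t) / len(log_t)
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxt = sum((lx - mean_x) * (lt - mean_t) for lx, lt in zip(log_x, log_t))
    slope = sxt / sxx                      # slope of log t versus log x
    c = math.exp(mean_t - slope * mean_x)  # prefactor from the intercept
    return c, -slope                       # eta is minus the slope

def comm_fraction(p, cores_per_node, c_comp, c_comm, eta):
    """Share of total time spent communicating, for T_comp = c_comp / p
    and T_comm = c_comm * n**(-eta), with n = p / cores_per_node (assumed)."""
    n = p / cores_per_node
    t_comp = c_comp / p
    t_comm = c_comm * n ** (-eta)
    return t_comm / (t_comp + t_comm)

# Synthetic communication timings (hypothetical, not data from the paper).
nodes = [64, 128, 256, 512, 1024]
eta_true = 0.7
comm_times = [5.0 * n ** (-eta_true) for n in nodes]
c_fit, eta_fit = fit_power_law(nodes, comm_times)

# Communication share at increasing core counts, with hypothetical prefactors.
core_counts = [1024, 4096, 16384, 65536]
fractions = [comm_fraction(p, 32, 100.0, 5.0, eta_fit) for p in core_counts]

print("fitted exponent:", eta_fit)
for p, f in zip(core_counts, fractions):
    print(p, "cores: communication share", f)
```

Because the computation term falls as p^{-1} while the communication term falls more slowly, the ratio of communication to computation grows roughly as p^{1-\eta} in this model. This is consistent with the abstract's observation that communication dominates, and more so on the machine with the smaller exponents.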
