Architectural choices for the Columbia 0.8 Teraflops machine
Igor V. Arsenin

TL;DR
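This paper describes the hardware design choices behind a 16K-node, 0.8 Teraflops supercomputer optimized for full QCD (quantum chromodynamics) calculations, addresses the efficiency of the conjugate gradient algorithm on that architecture, and discusses the technological innovations and software tools that facilitated the design.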
Contribution
It presents the hardware design decisions for a machine architecture optimized for full QCD calculations, together with the technological innovations and software tools that facilitate hardware design.
Findings
Optimized architecture for full QCD calculations
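Conjugate gradient efficiency addressed in terms of floating-point balance, memory handling and utilization, and communication overhead
Technological innovations and software tools that facilitate hardware design and open opportunities for the academic community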
Abstract
We discuss the hardware design choices made in our 16K-node 0.8 Teraflops supercomputer project, a machine architecture optimized for full QCD calculations. The efficiency of the conjugate gradient algorithm in terms of balance of floating-point operations, memory handling and utilization, and communication overhead is addressed. We also discuss the technological innovations and software tools that facilitate hardware design and what opportunities these give to the academic community.
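For readers unfamiliar with the algorithm named in the abstract, the following is a minimal, generic sketch of the conjugate gradient method for a symmetric positive-definite system. It is not the authors' implementation and does not involve the QCD operator itself. The names `conjugate_gradient`, `apply_a`, `dot`, and `apply_small_matrix` are hypothetical and chosen only for illustration. The comments mark where floating-point work and, on a distributed machine, communication would typically arise.

```python
def dot(u, v):
    """Inner product; on a multi-node machine this would need a global sum."""
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite operator A.

    apply_a is a function returning A applied to a vector (an assumption of
    this sketch, so that the matrix never has to be stored explicitly).
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    iterations = 0
    while iterations < max_iter and rs_old ** 0.5 > tol:
        ap = apply_a(p)                  # matrix-vector product: the dominant cost
        alpha = rs_old / dot(p, ap)      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old           # weight for the next search direction
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        iterations += 1
    return x, iterations


def apply_small_matrix(v):
    """Illustrative 2x2 symmetric positive-definite matrix [[4, 1], [1, 3]]."""
    return [4.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 3.0 * v[1]]


solution, used = conjugate_gradient(apply_small_matrix, [1.0, 2.0])
```

Each iteration consists of one operator application, two inner products, and three vector updates, which is why the balance between floating-point operations, memory traffic, and communication largely determines how efficiently the method runs.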
