Benchmarking and tuning the MILC code on clusters and supercomputers
Steven Gottlieb (Fermilab/Indiana University)

TL;DR
This paper benchmarks and tunes the MILC code on several architectures (Intel Itanium and Pentium IV, dual-CPU Athlon, and Compaq Alpha) and describes simple code changes that dramatically speed up the KS conjugate gradient on processors with more advanced memory systems, such as the PIV, IBM SP, and Alpha.
Contribution
It reports benchmarking and tuning results for the MILC code on a range of cluster and supercomputer architectures, and identifies simple code changes that improve performance.
Findings
Simple code changes can yield a very dramatic speedup of the KS conjugate gradient
Benchmarks cover Intel Itanium and Pentium IV (PIV), dual-CPU Athlon, and Compaq Alpha nodes
The speedup applies to processors with more advanced memory systems, such as PIV, IBM SP, and Alpha
Abstract
Recently, we have benchmarked and tuned the MILC code on a number of architectures including Intel Itanium and Pentium IV (PIV), dual-CPU Athlon, and the latest Compaq Alpha nodes. Results will be presented for many of these, and we shall discuss some simple code changes that can result in a very dramatic speedup of the KS conjugate gradient on processors with more advanced memory systems such as PIV, IBM SP and Alpha.
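For readers unfamiliar with the solver named in the abstract, the following is a minimal sketch of a generic conjugate gradient iteration on a small dense symmetric positive-definite system. It is not the MILC implementation and does not reproduce the code changes the paper discusses; the function names (`matvec`, `dot`, `conjugate_gradient`) and the 2x2 test matrix are assumptions made purely for illustration. In the KS case the operator would be the Hermitian positive-definite M†M acting on lattice fields rather than a small dense matrix.

```python
# Illustrative sketch only: a textbook conjugate gradient on a dense
# symmetric positive-definite matrix stored as a list of rows.
# All names here are hypothetical and are not taken from the MILC code.

def matvec(A, x):
    """Return the matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Return the inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Stops when |r|^2 <= tol^2 * |b|^2 or after max_iter iterations.
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the starting guess x = 0
    p = list(r)            # initial search direction
    rr = dot(r, r)
    stop = tol * tol * dot(b, b)
    iterations = 0
    while rr > stop and iterations < max_iter:
        Ap = matvec(A, p)  # the one operator application per iteration
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        iterations += 1
    return x, iterations


# Small usage example with an assumed 2x2 test system.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution, n_iter = conjugate_gradient(A, b)
print(solution, n_iter)
```

Each iteration consists of one operator application plus a few vector updates and inner products. These operations do little arithmetic per value loaded, which is one plausible reason a solver of this kind would be sensitive to the memory system.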
Taxonomy
Topics: Catalytic Processes in Materials Science · Particle Detector Development and Performance · Ammonia Synthesis and Nitrogen Reduction
