Double-precision FPUs in High-Performance Computing: an Embarrassment of Riches?
Jens Domke, Kazuaki Matsumura, Mohamed Wahib, Haoyu Zhang, Keita Yashima, Toshiki Tsuchikawa, Yohei Tsuji, Artur Podobas, Satoshi Matsuoka

TL;DR
This study empirically evaluates whether HPC applications need extensive double-precision hardware. It finds that many can tolerate reduced double-precision support with little performance loss, which challenges the traditional assumption.
Contribution
It provides a comprehensive comparison of HPC proxy applications on two similar processors with different double-precision hardware allocations, quantifying the impact of reduced double-precision support.
Findings
Significant reduction in double-precision hardware can be achieved with minimal performance impact.
Many HPC applications do not require full double-precision support.
Results support industry trends towards hybrid-precision hardware units.
Abstract
Among the (uncontended) common wisdom in High-Performance Computing (HPC) is the applications' need for a large amount of double-precision support in hardware. Hardware manufacturers, the TOP500 list, and (rarely revisited) legacy software have without doubt followed and contributed to this view. In this paper, we challenge that wisdom, and we do so by exhaustively comparing a large number of HPC proxy applications on two processors: Intel's Knights Landing (KNL) and Knights Mill (KNM). Although similar, the KNM and KNL deviate architecturally at one important point: the silicon area devoted to double-precision arithmetic. This fortunate discrepancy allows us to empirically quantify the performance impact of reducing the amount of hardware double-precision arithmetic. Our analysis shows that this common wisdom might not always be right. We find that the investigated HPC proxy…
Taxonomy
Topics: Numerical Methods and Algorithms · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
