An ECM-based energy-efficiency optimization approach for bandwidth-limited streaming kernels on recent Intel Xeon processors
Johannes Hofmann, Dietmar Fey

TL;DR
This paper presents an analytical approach that combines low-level analysis, an extended ECM performance model, and hardware-parameter tuning to improve the energy efficiency of memory-bound streaming kernels on recent Intel Xeon processors, reducing chip energy consumption by a factor of 2.0-2.4.
Contribution
It extends the ECM model to include software optimizations and demonstrates an analytical method for energy-efficient tuning of memory-bound applications on modern CPUs.
Findings
Chip energy consumption is reduced by a factor of 2.0-2.4 on the examined processors.
The incremental, model-guided analysis pinpoints microarchitectural improvements that are relevant to energy consumption.
A 2D Jacobi solver serves as the case study and is intended as a blueprint for other memory-bound applications.
Abstract
We investigate an approach that uses low-level analysis and the execution-cache-memory (ECM) performance model in combination with tuning of hardware parameters to lower energy requirements of memory-bound applications. The ECM model is extended appropriately to deal with software optimizations such as non-temporal stores. Using incremental steps and the ECM model, we analytically quantify the impact of various single-core optimizations and pinpoint microarchitectural improvements that are relevant to energy consumption. Using a 2D Jacobi solver as an example that can serve as a blueprint for other memory-bound applications, we evaluate our approach on the four most recent Intel Xeon E5 processors (Sandy Bridge-EP, Ivy Bridge-EP, Haswell-EP, and Broadwell-EP). We find that chip energy consumption can be reduced by a factor of 2.0-2.4 on the examined processors.
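For readers unfamiliar with the case study, the following is a minimal sketch of a 2D Jacobi sweep, assuming the common five-point form in which each interior point becomes the average of its four neighbours. The abstract does not spell out the stencil, so that form, the function names `jacobi_sweep` and `make_grid`, and the boundary values are illustrative assumptions, not the authors' code. The sketch only shows the access pattern: each update does very little arithmetic per element read or written, which is why kernels of this kind tend to be limited by memory bandwidth.

```python
def jacobi_sweep(src, dst):
    """One out-of-place sweep: every interior point of dst becomes the
    average of its four neighbours in src. Boundary rows and columns of
    dst are left untouched."""
    ny, nx = len(src), len(src[0])
    for y in range(1, ny - 1):
        for x in range(1, nx - 1):
            dst[y][x] = 0.25 * (src[y][x - 1] + src[y][x + 1]
                                + src[y - 1][x] + src[y + 1][x])


def make_grid(ny, nx, top=1.0):
    """Hypothetical helper: zero grid whose top boundary row is fixed at `top`."""
    grid = [[0.0] * nx for _ in range(ny)]
    grid[0] = [top] * nx
    return grid


# Two grids are swapped after each sweep, so reads and writes never alias.
a = make_grid(6, 6)
b = make_grid(6, 6)
for _ in range(10):
    jacobi_sweep(a, b)
    a, b = b, a
```

The out-of-place formulation with two grids means the destination grid is written without being read first, which is the kind of store stream that an optimization such as non-temporal stores targets.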
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Cloud Computing and Resource Management
