Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers
Georg Hager, Thomas Zeiser, Gerhard Wellein

TL;DR
This paper investigates how data access patterns affect performance on multi-core CPUs with multiple memory controllers, proposing optimizations to reduce bottlenecks through data layout strategies.
Contribution
It introduces specific data layout and padding techniques to mitigate memory bottlenecks in multi-core architectures with multiple memory controllers.
Findings
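Codes with regular access patterns show bottlenecks and erratic performance on systems with multiple memory controllers; the effect resembles cache thrashing and aliasing conflicts but is more severe because main memory access is involved.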
Careful data layout and padding can significantly improve performance.
Analysis of low-level and application benchmarks on the Sun UltraSPARC T2 demonstrates the effectiveness of the proposed optimizations.
Abstract
Processor and system architectures that feature multiple memory controllers are prone to show bottlenecks and erratic performance numbers on codes with regular access patterns. Although such effects are well known in the form of cache thrashing and aliasing conflicts, they become more severe when memory access is involved. Using the new Sun UltraSPARC T2 processor as a prototypical multi-core design, we analyze performance patterns in low-level and application benchmarks and show ways to circumvent bottlenecks by careful data layout and padding.
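The aliasing effect described in the abstract can be illustrated with a small model. The sketch below is not taken from the paper: it assumes a hypothetical machine with four memory controllers where consecutive 128-byte blocks of the address space rotate across controllers, and it treats array base addresses as simple offsets. The names `controller_of`, `controllers_hit`, and `padded_bases` are illustrative only. It shows that several large, power-of-two-sized arrays laid out back to back can all map to the same controller at a given loop index, while a small pad after each array spreads them across all controllers.

```python
# Toy model of address-to-controller mapping (all parameters are assumptions).
NUM_CONTROLLERS = 4      # assumed number of memory controllers
INTERLEAVE_BYTES = 128   # assumed: each 128-byte block goes to the next controller


def controller_of(addr):
    """Controller that serves byte address `addr` under the assumed interleaving."""
    return (addr // INTERLEAVE_BYTES) % NUM_CONTROLLERS


def controllers_hit(base_addrs, index, elem_bytes=8):
    """Set of controllers touched when every array is accessed at element `index`."""
    return {controller_of(base + index * elem_bytes) for base in base_addrs}


def padded_bases(n_arrays, array_bytes, pad_bytes):
    """Base offsets of arrays placed back to back, with `pad_bytes` after each one."""
    return [k * (array_bytes + pad_bytes) for k in range(n_arrays)]


array_bytes = 2**20 * 8  # each array holds 2^20 double-precision values

# No padding: every base is a multiple of a large power of two.
unpadded = padded_bases(4, array_bytes, 0)
# One interleave block of padding shifts each array to the next controller.
padded = padded_bases(4, array_bytes, INTERLEAVE_BYTES)

unpadded_hit = controllers_hit(unpadded, index=100)
padded_hit = controllers_hit(padded, index=100)

print("controllers used without padding:", len(unpadded_hit))
print("controllers used with padding:   ", len(padded_hit))
```

In this model the unpadded layout concentrates all four streams on one controller at every loop index, whereas the padded layout uses all four. On real hardware the mapping depends on physical addresses and on how threads are scheduled, so a suitable pad size would have to be determined from the processor documentation or by measurement.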
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Ferroelectric and Negative Capacitance Devices
