NumaPerf: Predictive and Full NUMA Profiling
Xin Zhao (University of Massachusetts Amherst), Jin Zhou (University of Massachusetts Amherst), Hui Guan (University of Massachusetts Amherst), Wei Wang (University of Texas at San Antonio), Xu Liu (North Carolina State University)

TL;DR
NumaPerf is a novel NUMA profiling tool that predicts and identifies potential performance issues across architectures, focusing on memory sharing, thread migration, and load imbalance, leading to significant performance improvements.
Contribution
NumaPerf introduces a portable, architecture-agnostic profiling approach that detects a wider range of NUMA-related issues than existing tools.
Findings
Identifies more performance issues than existing profilers.
Yields up to a 5.94x speedup once the detected issues are fixed.
Effectively detects thread migration and load imbalance problems.
Abstract
Achieving optimal performance for parallel applications on NUMA architectures is extremely challenging, which necessitates the assistance of profiling tools. However, existing NUMA-profiling tools share similar shortcomings in portability, effectiveness, and helpfulness. This paper proposes a novel profiling tool, NumaPerf, that overcomes these issues. NumaPerf aims to identify potential performance issues for any NUMA architecture, rather than only for the current hardware. To achieve this, NumaPerf focuses on memory sharing patterns between threads, instead of real remote accesses. NumaPerf further detects potential thread migrations and load imbalance issues that could significantly affect performance but are omitted by existing profilers. NumaPerf also separates out cache coherence issues, which may require different fix strategies. Based on our extensive…
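To make the idea of profiling sharing patterns rather than real remote accesses concrete, here is a minimal sketch. It is not NumaPerf's implementation: the trace format, the function name `classify_sharing`, and the page, cache-line, and word sizes are all assumptions chosen for illustration. The sketch flags pages touched by more than one thread (candidates for remote accesses on some NUMA placement, whatever the current machine looks like) and separately flags cache lines with multi-thread access and at least one writer, split into true and false sharing.

```python
from collections import defaultdict

# Assumed granularities; real hardware and tools may differ.
PAGE_SIZE = 4096  # bytes per page
LINE_SIZE = 64    # bytes per cache line
WORD_SIZE = 8     # bytes per tracked word


def classify_sharing(trace):
    """Classify sharing in a hypothetical access trace.

    trace: iterable of (thread_id, address, is_write) tuples.
    Returns a dict of sorted page and cache-line indices.
    """
    page_threads = defaultdict(set)                     # page -> threads
    line_words = defaultdict(lambda: defaultdict(set))  # line -> word -> threads
    line_writers = defaultdict(set)                     # line -> writing threads

    for tid, addr, is_write in trace:
        page_threads[addr // PAGE_SIZE].add(tid)
        line = addr // LINE_SIZE
        line_words[line][addr // WORD_SIZE].add(tid)
        if is_write:
            line_writers[line].add(tid)

    # A page touched by several threads may be remote for some of them
    # under some thread-to-node placement, independent of this machine.
    shared_pages = sorted(p for p, t in page_threads.items() if len(t) > 1)

    true_sharing, false_sharing = [], []
    for line, words in line_words.items():
        threads = set().union(*words.values())
        # Coherence traffic needs at least two threads and one writer.
        if len(threads) < 2 or not line_writers[line]:
            continue
        if any(len(t) > 1 for t in words.values()):
            true_sharing.append(line)   # the same word is used by several threads
        else:
            false_sharing.append(line)  # different words, same cache line
    return {
        "shared_pages": shared_pages,
        "true_sharing_lines": sorted(true_sharing),
        "false_sharing_lines": sorted(false_sharing),
    }
```

Keeping page-level and line-level results apart mirrors the abstract's point that cache coherence issues may need a different fix (for example padding) than cross-thread page sharing (for example changing data placement or thread binding).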
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Advanced Data Storage Technologies
