CPU Simulation Using Two-Phase Stratified Sampling
Magnus Ekman

TL;DR
This paper analyzes the accuracy of SimPoint in CPU simulation, revealing significant errors with typical usage and proposing a two-phase stratified sampling method that greatly improves accuracy and efficiency.
Contribution
It introduces a two-phase stratified sampling approach for CPU simulation that reduces error and sample size compared to traditional methods.
Findings
SimPoint can cause 40-60% prediction errors with 20 samples
Two-phase sampling reduces maximum error to 3%
Order-of-magnitude reduction in sample size needed
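This summary does not spell out the paper's two-phase procedure. As a rough illustration of the general idea, the sketch below follows the textbook form of two-phase (double) sampling for stratification: a large, cheap first-phase sample estimates the stratum weights, and a small, expensive second-phase subsample estimates each stratum mean. The names `two_phase_estimate`, `classify`, and `measure` are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict

def two_phase_estimate(units, classify, measure, n_phase1, n_per_stratum, seed=0):
    """Textbook two-phase stratified estimate of a population mean.

    units         -- sequence of sampling units (e.g. interval indices; assumed)
    classify      -- cheap function mapping a unit to a stratum label (assumed)
    measure       -- expensive function returning the metric for a unit (assumed)
    n_phase1      -- size of the cheap first-phase sample
    n_per_stratum -- expensive measurements taken per stratum in phase two
    """
    rng = random.Random(seed)

    # Phase 1: draw a large sample and compute only the cheap stratum label.
    phase1 = rng.sample(units, n_phase1)
    strata = defaultdict(list)
    for unit in phase1:
        strata[classify(unit)].append(unit)

    estimate = 0.0
    for members in strata.values():
        # Stratum weight is estimated from the first-phase sample.
        weight = len(members) / n_phase1
        # Phase 2: measure only a small subsample within the stratum.
        k = min(n_per_stratum, len(members))
        subsample = rng.sample(members, k)
        stratum_mean = sum(measure(unit) for unit in subsample) / k
        estimate += weight * stratum_mean
    return estimate
```

In a simulation setting, `classify` would stand for a cheap profiling step and `measure` for a detailed simulation of one region. That mapping is an assumption made for illustration only.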
Abstract
Simulation remains a cornerstone of computer architecture research, yet full end-to-end application execution is prohibitively time-consuming. The industry-standard solution, SimPoint, mitigates this cost by selecting a small number of representative code regions that capture program phase behavior. In this work, we take a fresh look at phase behavior in the SPEC CPU 2017 Integer suite to assess how pronounced such behavior truly is and what accuracy can be expected from typical SimPoint usage. Based on previously published data, we argue that common SimPoint counts can induce substantial estimation errors. To explore this further, we recast SimPoint as a stratified sampling problem, which enables the derivation of a conservative confidence interval. The analysis indicates that significant errors are expected, and our empirical analysis confirms this: with 20 SimPoints, two applications…
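To make the stratified-sampling view concrete, here is a minimal sketch of the standard stratified mean estimator with a normal-approximation confidence interval. It is not the paper's conservative interval, which the truncated abstract does not describe. The function name `stratified_estimate` and the input layout are assumptions, the finite-population correction is ignored, and each stratum needs at least two samples so that its variance can be estimated.

```python
import math

def stratified_estimate(strata, z=1.96):
    """Return (mean, (low, high)) for a stratified sample.

    strata -- list of (weight, samples) pairs; weights are assumed to sum to 1
              and samples holds per-region measurements (e.g. CPI values).
    z      -- normal quantile for the interval (1.96 gives roughly 95%).
    """
    mean = 0.0
    variance = 0.0
    for weight, samples in strata:
        n = len(samples)
        if n < 2:
            raise ValueError("need at least two samples per stratum")
        stratum_mean = sum(samples) / n
        # Unbiased within-stratum sample variance.
        s2 = sum((x - stratum_mean) ** 2 for x in samples) / (n - 1)
        mean += weight * stratum_mean
        # Variance of the stratified mean, without finite-population correction.
        variance += weight ** 2 * s2 / n
    half_width = z * math.sqrt(variance)
    return mean, (mean - half_width, mean + half_width)
```

Under this reading, each cluster plays the role of a stratum and the cluster's share of execution is its weight. With a single sample per stratum, the within-stratum variance cannot be estimated from the sample itself, which is why this sketch rejects that case.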
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Numerical Methods and Algorithms · Embedded Systems Design Techniques
