CounterPoint: Using Hardware Event Counters to Refute and Refine Microarchitectural Assumptions (Extended Version)
Nick Lindsay (Yale University), Caroline Trippel (Stanford University), Anurag Khandelwal (Yale University), Abhishek Bhattacharjee (Yale University)

TL;DR
CounterPoint is a framework that uses hardware performance counters to validate and refine microarchitectural models, helping experts uncover undocumented hardware behaviors despite noisy data.
Contribution
CounterPoint introduces a method for testing microarchitectural models against performance counter data and for identifying plausible hardware features that would explain any discrepancies.
Findings
Identified undocumented and underdocumented behaviors in the Haswell Memory Management Unit (MMU).
Demonstrated CounterPoint's ability to refine microarchitectural understanding.
Revealed previously hidden hardware features, including a load-store queue-side TLB prefetcher.
Abstract
Hardware event counters offer the potential to reveal not only performance bottlenecks but also detailed microarchitectural behavior. In practice, this promise is undermined by their vague specifications, opaque designs, and multiplexing noise, making event counter data hard to interpret. We introduce CounterPoint, a framework that tests user-specified microarchitectural models, expressed as μpath Decision Diagrams, for consistency with performance counter data. When mismatches occur, CounterPoint pinpoints plausible microarchitectural features that could explain them, using multi-dimensional counter confidence regions to mitigate multiplexing noise. We apply CounterPoint to the Haswell Memory Management Unit as a case study, shedding light on multiple undocumented and underdocumented microarchitectural behaviors. These include a load-store queue-side TLB prefetcher, merging…
Taxonomy
Topics: Software System Performance and Reliability · Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance
