TL;DR
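nanoBench is a low-overhead tool for evaluating small, isolated code snippets with hardware performance counters on Intel and AMD x86 systems. It can run microbenchmarks directly in kernel space, which allows privileged instructions to be benchmarked and makes measurements more accurate.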
Contribution
The paper introduces nanoBench, a microbenchmarking tool that can execute benchmarks in kernel space and reads performance counters with minimal overhead, giving accurate hardware performance measurements.
Findings
Measured latency, throughput, and port usage of 13,000+ instruction variants.
Characterized cache architectures and replacement policies across 11 Intel microarchitectures.
Abstract
We present nanoBench, a tool for evaluating small microbenchmarks using hardware performance counters on Intel and AMD x86 systems. Most existing tools and libraries are intended to benchmark either entire programs or program segments in the context of their execution within a larger program. In contrast, nanoBench is specifically designed to evaluate small, isolated pieces of code. Such code is common in microbenchmark-based hardware analysis techniques. Unlike previous tools, nanoBench can execute microbenchmarks directly in kernel space. This makes it possible to benchmark privileged instructions, and it enables more accurate measurements. The reading of the performance counters is implemented with minimal overhead, avoiding function calls and branches. As a consequence, nanoBench is precise enough to measure individual memory accesses. We illustrate the utility of nanoBench at the hand of…