uops.info: Characterizing Latency, Throughput, and Port Usage of Instructions on Intel Microarchitectures
Andreas Abel, Jan Reineke

TL;DR
This paper introduces uops.info, a tool that constructs faithful models of the latency, throughput, and port usage of x86 instructions on Intel microarchitectures, as a basis for performance prediction and optimization.
Contribution
The paper presents novel algorithms for inferring instruction latency, throughput, and port usage, and provides the results as a machine-readable database covering multiple Intel microarchitectures.
Findings
In some cases, the inferred models differ significantly from prior work.
The paper provides detailed instruction performance data for all Intel Core generations it covers.
The resulting data is intended to enable more accurate performance predictions and compiler optimizations.
Abstract
Modern microarchitectures are some of the world's most complex man-made systems. As a consequence, it is increasingly difficult to predict, explain, let alone optimize the performance of software running on such microarchitectures. As a basis for performance predictions and optimizations, we would need faithful models of their behavior, which are, unfortunately, seldom available. In this paper, we present the design and implementation of a tool to construct faithful models of the latency, throughput, and port usage of x86 instructions. To this end, we first discuss common notions of instruction throughput and port usage, and introduce a more precise definition of latency that, in contrast to previous definitions, considers dependencies between different pairs of input and output operands. We then develop novel algorithms to infer the latency, throughput, and port usage based on…
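To make the three notions in the abstract concrete, here is a minimal Python sketch of how such a model could be represented. It is an illustration only, not the authors' implementation or the uops.info data format: the `InstrModel` class, the `throughput_bound` helper, the instruction name, and all numbers are hypothetical. Latency is stored per (input operand, output operand) pair, as the abstract describes, and port usage is stored as groups of ports with a count of micro-ops that may run on any port in the group. The helper derives a port-pressure bound on cycles per instruction by checking every subset of ports, which is a standard counting argument and an assumption here rather than the paper's algorithm.

```python
from dataclasses import dataclass
from fractions import Fraction
from itertools import combinations


@dataclass
class InstrModel:
    """Hypothetical container for one instruction's performance model."""
    name: str
    latency: dict      # (input operand, output operand) -> cycles
    port_usage: tuple  # ((ports, uop count), ...), e.g. (("0156", 1), ("23", 1))


def throughput_bound(model: InstrModel) -> Fraction:
    """Bound on cycles per instruction implied by port pressure alone.

    For every subset S of ports, the uops that can run only on ports in S
    need at least (number of such uops) / |S| cycles. The largest such
    ratio is the bound.
    """
    ports = sorted({p for group, _ in model.port_usage for p in group})
    best = Fraction(0)
    for size in range(1, len(ports) + 1):
        for subset in combinations(ports, size):
            allowed = set(subset)
            uops = sum(n for group, n in model.port_usage if set(group) <= allowed)
            best = max(best, Fraction(uops, size))
    return best


# Made-up values for illustration; these are not measurements from the paper.
example = InstrModel(
    name="HYPOTHETICAL_OP",
    latency={("src", "dst"): 1, ("dst", "dst"): 1, ("src", "flags"): 2},
    port_usage=(("0156", 1), ("23", 1)),
)

print(throughput_bound(example))      # port-pressure bound in cycles per instruction
print(max(example.latency.values()))  # a single worst-case number hides the per-pair detail
```

Keeping latency keyed by operand pair means a dependency chain through `dst` can be costed differently from one through `flags`, which a single latency value per instruction cannot express.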
