Automated Instruction Stream Throughput Prediction for Intel and AMD Microarchitectures

Jan Laukemann; Julian Hammer; Johannes Hofmann; Georg Hager; Gerhard Wellein

arXiv:1809.00912 · cs.PF · June 25, 2020

TL;DR

This paper introduces OSACA, a static analysis tool that predicts instruction stream throughput on Intel and AMD microarchitectures, aiding performance modeling and understanding hardware-code interactions.

Contribution

The paper presents OSACA, a novel open-source static analysis tool for predicting in-core performance of x86 instruction streams on modern microarchitectures.

Findings

1. OSACA accurately predicts execution times for benchmark kernels.
2. Models built for Skylake and Zen architectures match measured performance.
3. The approach can be extended to future architectures.

Abstract

An accurate prediction of scheduling and execution of instruction streams is a necessary prerequisite for predicting the in-core performance behavior of throughput-bound loop kernels on out-of-order processor architectures. Such predictions are an indispensable component of analytical performance models, such as the Roofline and the Execution-Cache-Memory (ECM) model, and allow a deep understanding of the performance-relevant interactions between hardware architecture and loop code. We present the Open Source Architecture Code Analyzer (OSACA), a static analysis tool for predicting the execution time of sequential loops comprising x86 instructions under the assumption of an infinite first-level cache and perfect out-of-order scheduling. We show the process of building a machine model from available documentation and semi-automatic benchmarking, and carry it out for the latest Intel…
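To make the idea of a throughput prediction for a loop body concrete, the following is a minimal sketch of a port-pressure estimate. It is not OSACA's code or its machine model: the table `PORT_MODEL`, the port names, the instruction names, and the cycle counts are hypothetical, and splitting each operation evenly across its eligible ports is an assumption made here for illustration. Under the stated idealizations (all data in first-level cache, perfect out-of-order scheduling), the busiest port bounds the cycles per loop iteration.

```python
from collections import defaultdict

# Hypothetical machine model (illustrative numbers only, not taken from the paper):
# each instruction maps to a list of (eligible ports, cycles) entries.
PORT_MODEL = {
    "vaddpd": [(("0", "1"), 1.0)],
    "vmulpd": [(("0", "1"), 1.0)],
    "vmovapd_load": [(("2", "3"), 1.0)],
    "vmovapd_store": [(("4",), 1.0), (("2", "3"), 1.0)],
}


def predict_cycles(kernel):
    """Return (cycles per iteration, per-port pressure) for one loop body.

    Assumption: work is split evenly across all eligible ports, and the
    most heavily loaded port determines the throughput-bound runtime.
    """
    pressure = defaultdict(float)
    for instr in kernel:
        for ports, cycles in PORT_MODEL[instr]:
            share = cycles / len(ports)  # even split across eligible ports
            for port in ports:
                pressure[port] += share
    bottleneck = max(pressure.values())
    return bottleneck, dict(pressure)


# One iteration of a hypothetical loop: two loads, an add, a multiply, a store.
kernel = ["vmovapd_load", "vmovapd_load", "vaddpd", "vmulpd", "vmovapd_store"]
cycles, pressure = predict_cycles(kernel)
print(f"predicted cycles per iteration: {cycles}")
for port in sorted(pressure):
    print(f"  port {port}: {pressure[port]}")
```

In this toy model the load/address ports carry the highest pressure, so they set the predicted cycle count; a real machine model would need measured port assignments for each instruction form.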
