Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels

Jan Laukemann; Julian Hammer; Georg Hager; Gerhard Wellein

arXiv:1910.00214 · cs.PF · June 25, 2020

TL;DR

This paper extends the OSACA static analysis tool to support ARM architectures and critical path prediction, enabling cross-architecture performance modeling of loop kernels on various micro-architectures.

Contribution

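The authors extended OSACA, which previously handled only x86 (Intel and AMD) and assumed optimistic full-throughput execution, with support for ARM instructions and with critical path prediction, so that one tool can model loop kernels on both x86 and ARM micro-architectures.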

Findings

01 Predictions closely match actual runtime measurements.

02 Extended OSACA supports multiple architectures including x86 and ARM.

03 Critical path analysis improves understanding of performance bottlenecks.

Abstract

Useful models of loop kernel runtimes on out-of-order architectures require an analysis of the in-core performance behavior of instructions and their dependencies. While an instruction throughput prediction sets a lower bound to the kernel runtime, the critical path defines an upper bound. Such predictions are an essential part of analytic (i.e., white-box) performance models like the Roofline and Execution-Cache-Memory (ECM) models. They enable a better understanding of the performance-relevant interactions between hardware architecture and loop code. The Open Source Architecture Code Analyzer (OSACA) is a static analysis tool for predicting the execution time of sequential loops. It previously supported only x86 (Intel and AMD) architectures and simple, optimistic full-throughput execution. We have heavily extended OSACA to support ARM instructions and critical path prediction…
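
The two bounds named in the abstract can be illustrated with a small sketch. This is not OSACA's implementation: the instruction names, latencies, and port names below are hypothetical, and the even split of each instruction across its eligible ports is a simplifying assumption. The throughput bound is the load on the busiest port; the critical path is the longest latency chain through the dependencies of one iteration.

```python
from collections import defaultdict

# Hypothetical toy kernel, listed in program order:
# (name, latency in cycles, eligible ports, names of instructions it depends on)
KERNEL = [
    ("load_a", 4, ("P2", "P3"), ()),
    ("load_b", 4, ("P2", "P3"), ()),
    ("mul",    4, ("P0", "P1"), ("load_a", "load_b")),
    ("add",    4, ("P0", "P1"), ("mul",)),
    ("store",  1, ("P4",),      ("add",)),
]


def throughput_bound(kernel):
    """Cycles per iteration if only port pressure limits execution.

    Assumption: each instruction is spread evenly over its eligible ports.
    """
    pressure = defaultdict(float)
    for _name, _latency, ports, _deps in kernel:
        for port in ports:
            pressure[port] += 1.0 / len(ports)
    return max(pressure.values())


def critical_path(kernel):
    """Cycles for the longest dependency chain within one iteration."""
    finish = {}
    for name, latency, _ports, deps in kernel:
        # Program order guarantees every dependency was already processed.
        start = max((finish[d] for d in deps), default=0)
        finish[name] = start + latency
    return max(finish.values())


if __name__ == "__main__":
    print("throughput bound (cy/it):", throughput_bound(KERNEL))  # 1.0
    print("critical path (cy/it):  ", critical_path(KERNEL))      # 13
```

For this toy kernel every port carries exactly one cycle of work, so the throughput bound is 1 cycle per iteration, while the chain load, mul, add, store adds up to 13 cycles. A measured runtime would be expected to fall between the two values, which is the sense in which the abstract calls them lower and upper bounds.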
