How to keep pushing ML accelerator performance? Know your rooflines!

Marian Verhelst; Luca Benini; Naveen Verma

arXiv:2505.16346 · cs.AR · May 26, 2025

TL;DR

This paper surveys trends in ML hardware accelerators, introduces an enhanced roofline model to analyze their performance, and provides insights for optimizing efficiency and throughput in ML systems.

Contribution

It presents an improved roofline framework tailored for ML accelerators, unifying compute and memory interactions to guide performance optimization.

Findings

01 Enhanced roofline model effectively characterizes ML accelerator performance

02 Examples demonstrate how to identify bottlenecks and optimize designs

03 Framework reveals open research opportunities for further improvements

Abstract

The rapidly growing importance of Machine Learning (ML) applications, coupled with their ever-increasing model size and inference energy footprint, has created a strong need for specialized ML hardware architectures. Numerous ML accelerators have been explored and implemented, primarily to increase task-level throughput per unit area and reduce task-level energy consumption. This paper surveys key trends toward these objectives for more efficient ML accelerators and provides a unifying framework to understand how compute and memory technologies/architectures interact to enhance system-level efficiency and performance. To achieve this, the paper introduces an enhanced version of the roofline model and applies it to ML accelerators as an effective tool for understanding where various execution regimes fall within roofline bounds and how to maximize performance and efficiency under the…
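
For readers new to the idea, the classic roofline bound that the paper builds on can be sketched in a few lines of Python. This is only the textbook form, in which attainable throughput is the lesser of peak compute and memory bandwidth times operational intensity. It is not the enhanced model the abstract refers to, whose details are not given here. The function names and the hardware numbers in the usage note are hypothetical and chosen only for illustration.

```python
def attainable_throughput(peak_ops_per_s, mem_bw_bytes_per_s, ops_per_byte):
    """Classic roofline bound: min(compute roof, bandwidth * operational intensity)."""
    return min(peak_ops_per_s, mem_bw_bytes_per_s * ops_per_byte)


def ridge_point(peak_ops_per_s, mem_bw_bytes_per_s):
    """Operational intensity (ops/byte) at which the two roofs intersect."""
    return peak_ops_per_s / mem_bw_bytes_per_s


def regime(peak_ops_per_s, mem_bw_bytes_per_s, ops_per_byte):
    """Label a workload by which roof limits it in the classic model."""
    if ops_per_byte < ridge_point(peak_ops_per_s, mem_bw_bytes_per_s):
        return "memory-bound"
    return "compute-bound"


# Hypothetical accelerator (assumed numbers, not taken from the paper):
# 100 Tops/s peak compute and 1 TB/s memory bandwidth.
PEAK = 100e12
BANDWIDTH = 1e12

low_intensity = attainable_throughput(PEAK, BANDWIDTH, 10.0)    # limited by bandwidth
high_intensity = attainable_throughput(PEAK, BANDWIDTH, 400.0)  # limited by compute
```

With these assumed numbers the ridge point sits at 100 ops/byte. A workload at 10 ops/byte falls under the bandwidth roof, and one at 400 ops/byte is capped by peak compute.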
