Performance and Power: Systematic Evaluation of AI Workloads on Accelerators with CARAML

Chelsea Maria John; Stepan Nassyr; Carolin Penke; Andreas Herten

arXiv:2409.12994·cs.AR·March 14, 2025

TL;DR

This paper presents CARAML, a comprehensive benchmark suite for evaluating the performance and energy efficiency of AI workloads on diverse hardware accelerators, aiding in systematic hardware comparison.

Contribution

Introduction of CARAML, a novel, extensible benchmark suite for assessing performance and power consumption of ML workloads on various accelerators.

Findings

01. CARAML enables consistent performance evaluation across hardware.

02. The framework supports automated and reproducible benchmarking.

03. Initial results highlight differences in energy efficiency among accelerators.

Abstract

The rapid advancement of machine learning (ML) technologies has driven the development of specialized hardware accelerators designed to facilitate more efficient model training. This paper introduces the CARAML benchmark suite, which is employed to assess performance and energy consumption during the training of transformer-based large language models and computer vision models on a range of hardware accelerators, including systems from NVIDIA, AMD, and Graphcore. CARAML provides a compact, automated, extensible, and reproducible framework for assessing the performance and energy of ML workloads across various novel hardware architectures. The design and implementation of CARAML, along with a custom power measurement tool called jpwr, are discussed in detail.
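As a rough illustration of how performance and energy can be combined into one efficiency figure, the sketch below integrates sampled power readings over time to get energy and then divides the amount of work done by that energy. It is a minimal sketch under stated assumptions, not the method used by CARAML or jpwr: the function names, the trapezoidal integration, and the sample values are all hypothetical.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power samples (watts) over time (seconds) with the trapezoidal rule.

    Hypothetical helper for illustration; not part of CARAML or jpwr.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    if len(timestamps_s) < 2:
        return 0.0  # a single sample spans no time interval
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def work_per_watt_hour(work_items: float, energy_j: float) -> float:
    """Work items (e.g. tokens or images) processed per watt-hour of energy."""
    if energy_j <= 0:
        raise ValueError("energy must be positive")
    return work_items / (energy_j / 3600.0)  # 1 Wh = 3600 J


if __name__ == "__main__":
    # Made-up samples: a device drawing a constant 100 W for 2 seconds.
    energy = energy_joules([0.0, 1.0, 2.0], [100.0, 100.0, 100.0])
    print(energy, work_per_watt_hour(1000, energy))
```

Normalizing by energy rather than by time alone is what lets two accelerators with different power draws be compared on efficiency as well as raw throughput.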

Code & Models

Repositories

FZJ-JSC/CARAML (PyTorch, official)

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Advanced Neural Network Applications