MLCommons Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces

Srinivas Sridharan; Theodor-Adrian Badea; Andy Balogh; Bradford M. Beckmann; Brian Coutinho; Louis Feng; Sheng Fu; Sanshan Gao; Mehryar Garakani; Taekyung Heo; David Kanter; Josh Ladd; Ziwei Li; Winston Liu; Changhai Man; Dan Mihailescu; Spandan More; Joongun Park; Ashwin Ramachandran; Vinay Ramakrishnaiah; Saeed Rashidi; Vijay Janapa Reddi; Puneet Sharma; Phio Tian; William Won; Hanjiang Wu; Huan Xu; Jinsun Yoo; and Tushar Krishna

arXiv:2605.11333·cs.DC·May 20, 2026

TL;DR

Chakra is an open ecosystem that uses standardized execution traces to improve performance benchmarking and co-design in AI systems, enabling better observation, reproduction, and optimization of distributed ML workloads.

Contribution

The paper introduces Chakra, a portable framework with a graph-based execution trace format for performance analysis and co-design of AI/ML workloads across diverse tools and platforms.

Findings

01

Chakra ETs effectively represent key operations and dependencies in distributed AI workloads.

02

Real-world case studies demonstrate Chakra's utility in optimizing AI system performance.

03

Industry adopters include NVIDIA, AMD, and Meta, among others.

Abstract

The fast pace of artificial intelligence (AI) innovation demands an agile methodology for observation, reproduction, and optimization of distributed machine learning (ML) workload behavior in production AI systems, one that also enables efficient software-hardware (SW-HW) co-design for future systems. We present Chakra, an open and portable ecosystem for performance benchmarking and co-design. The core component of Chakra is an open and interoperable graph-based representation of distributed AI/ML workloads, called the Chakra execution trace (ET). These ETs represent key operations (such as compute, memory, and communication), data and control dependencies, timing, and resource constraints. Additionally, Chakra includes a complementary set of tools and capabilities to enable the collection, analysis, generation, and adoption of Chakra ETs by a broad range of simulators, emulators, and replay tools. We…
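To make the idea of a graph-based execution trace concrete, here is a minimal Python sketch of nodes that carry an operation kind, a duration, and data and control dependencies. It is not the Chakra ET schema: the names `TraceNode`, `NodeKind`, `replay_order`, and `critical_path_us`, the field layout, and the example durations are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class NodeKind(Enum):
    # Hypothetical operation categories, mirroring the abstract's
    # "compute, memory, and communication".
    COMPUTE = "compute"
    MEMORY = "memory"
    COMMUNICATION = "communication"


@dataclass
class TraceNode:
    # Illustrative fields only; not the real Chakra ET node definition.
    node_id: int
    name: str
    kind: NodeKind
    duration_us: float
    data_deps: list[int] = field(default_factory=list)
    ctrl_deps: list[int] = field(default_factory=list)


def replay_order(nodes):
    """Return node ids in an order that respects every dependency."""
    graph = {n.node_id: set(n.data_deps) | set(n.ctrl_deps) for n in nodes}
    return list(TopologicalSorter(graph).static_order())


def critical_path_us(nodes):
    """Earliest finish time of the whole trace, assuming unlimited resources."""
    by_id = {n.node_id: n for n in nodes}
    finish = {}
    for node_id in replay_order(nodes):
        node = by_id[node_id]
        deps = node.data_deps + node.ctrl_deps
        # A node can start once all of its predecessors have finished.
        start = max((finish[d] for d in deps), default=0.0)
        finish[node_id] = start + node.duration_us
    return max(finish.values(), default=0.0)


# A made-up four-node trace for one training step on one rank.
trace = [
    TraceNode(0, "load_weights", NodeKind.MEMORY, 5.0),
    TraceNode(1, "matmul", NodeKind.COMPUTE, 20.0, data_deps=[0]),
    TraceNode(2, "all_reduce", NodeKind.COMMUNICATION, 15.0, data_deps=[1]),
    TraceNode(3, "optimizer_step", NodeKind.COMPUTE, 4.0,
              data_deps=[2], ctrl_deps=[1]),
]
```

Calling `replay_order(trace)` on this chain yields the node ids in dependency order, and `critical_path_us(trace)` sums the durations along the longest dependency chain. A simulator or replay tool would additionally need to model the resource constraints the abstract mentions, which this sketch ignores.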
