Adding MFMA Support to gem5

Marco Kurzynski; Matthew D. Sinclair

arXiv:2501.18113 · cs.AR · February 4, 2025

TL;DR

This paper enhances the gem5 simulator by integrating support for Matrix Core Engines (MCEs) on AMD GPUs, enabling detailed simulation of MFMA instructions for machine learning workloads and future system analysis.

Contribution

The work introduces support for MCEs in gem5's GPU model, allowing simulation of MFMA instructions on AMD MI200 and MI300 GPUs, which was not previously possible.

Findings

1. Enables simulation of ML workloads in gem5 with MCE support
2. Allows analysis of MCE optimizations on system behavior
3. Supports future GPU architecture research

Abstract

In this work, we extend gem5's GPU model to support Matrix Core Engines (MCEs). On the AMD MI200 and MI300 GPUs that gem5 supports, these MCEs execute Matrix Fused Multiply Add (MFMA) instructions at a variety of precisions. This support enables running state-of-the-art ML workloads in gem5 and examining how MCE optimizations affect the behavior of future systems.
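To make the MFMA operation mentioned in the abstract concrete, the following is a minimal reference sketch of the arithmetic such an instruction performs: a small matrix product accumulated into an existing tile, D = A x B + C. It is only an illustration of the math. The function name `mfma_reference` and the 2x2 tile shape are assumptions chosen for readability, not taken from the paper or from gem5's source, and the sketch does not model precisions, register layout, fused rounding, or timing.

```python
def mfma_reference(a, b, c):
    """Return D = A x B + C for small row-major matrices (lists of lists).

    Hypothetical helper for illustration only; this is not gem5's
    implementation. A is M x K, B is K x N, C and D are M x N.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("A must be M x K to match B's K rows")
    if any(len(row) != n for row in b):
        raise ValueError("B must have N columns in every row")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("C must be M x N")

    d = []
    for i in range(m):
        out_row = []
        for j in range(n):
            # Start from the accumulator value, then add the dot product
            # of row i of A with column j of B.
            acc = c[i][j]
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out_row.append(acc)
        d.append(out_row)
    return d


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    c = [[1, 1], [1, 1]]
    print(mfma_reference(a, b, c))  # [[20, 23], [44, 51]]
```

Real MFMA instructions operate on fixed tile shapes and data types distributed across the lanes of a wavefront. A simulator must reproduce both the result and the instruction's latency, which this sketch does not attempt.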
