MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine
Endri Taka, Aman Arora, Kai-Chiang Wu, and Diana Marculescu

TL;DR
MaxEVA is a framework that significantly improves the performance and energy efficiency of matrix multiplication workloads on AMD/Xilinx Versal AI Engine hardware, enabling faster and more energy-efficient deep learning computations.
Contribution
The paper introduces MaxEVA, a novel framework that efficiently maps matrix multiplication onto Versal AIE devices and outperforms existing methods in both throughput and energy efficiency.
Findings
Achieves up to 5.44 TFLOPs for fp32 on Versal AIE.
Attains 77.01 TOPs for int8 precision.
Provides up to 20.4% higher energy efficiency than state-of-the-art approaches.
Abstract
The increasing computational and memory requirements of Deep Learning (DL) workloads have led to outstanding innovations in hardware architectures. An archetype of such architectures is the novel Versal AI Engine (AIE) by AMD/Xilinx. The AIE comprises multiple programmable processors optimized for vector-based algorithms. An AIE array consisting of 400 processor cores, operating at 1.25 GHz, is able to deliver a peak throughput of 8 TFLOPs for 32-bit floating-point (fp32) and 128 TOPs for 8-bit integer (int8) precision. In this work, we propose MaxEVA: a novel framework to efficiently map Matrix Multiplication (MatMul) workloads on Versal AIE devices. Our framework maximizes the performance and energy efficiency of MatMul applications by efficiently exploiting features of the AIE architecture and resolving performance bottlenecks from multiple angles. When demonstrating on the VC1902…
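As a rough sanity check on the figures above, the quoted peaks can be broken down per core and per cycle, and the reported results can be expressed as a fraction of peak. The sketch below is illustrative arithmetic only and is not part of MaxEVA. All names in it are hypothetical, and it assumes the reported 5.44 TFLOPs and 77.01 TOPs were measured on the same 400-core, 1.25 GHz array that the peak figures describe.

```python
# Peak figures quoted in the abstract.
NUM_CORES = 400          # AIE processor cores in the array
CLOCK_HZ = 1.25e9        # 1.25 GHz
PEAK_FP32_OPS = 8e12     # 8 TFLOPs (fp32)
PEAK_INT8_OPS = 128e12   # 128 TOPs (int8)

# Best reported results from the Findings list.
# Assumption: measured on the same 400-core array as the peaks above.
ACHIEVED_FP32_OPS = 5.44e12
ACHIEVED_INT8_OPS = 77.01e12


def ops_per_core_per_cycle(peak_ops, num_cores=NUM_CORES, clock_hz=CLOCK_HZ):
    """Operations each core must complete every cycle to reach the peak."""
    return peak_ops / (num_cores * clock_hz)


def fraction_of_peak(achieved_ops, peak_ops):
    """Achieved throughput as a fraction of the theoretical peak."""
    return achieved_ops / peak_ops


fp32_per_cycle = ops_per_core_per_cycle(PEAK_FP32_OPS)
int8_per_cycle = ops_per_core_per_cycle(PEAK_INT8_OPS)
fp32_fraction = fraction_of_peak(ACHIEVED_FP32_OPS, PEAK_FP32_OPS)
int8_fraction = fraction_of_peak(ACHIEVED_INT8_OPS, PEAK_INT8_OPS)

print(f"fp32: {fp32_per_cycle:.0f} ops/core/cycle, {fp32_fraction:.1%} of peak")
print(f"int8: {int8_per_cycle:.0f} ops/core/cycle, {int8_fraction:.1%} of peak")
```

Counting one multiply-accumulate as two operations, the peaks correspond to 8 fp32 MACs and 128 int8 MACs per core per cycle. Under the stated assumption, the reported results land at roughly 68% (fp32) and 60% (int8) of the theoretical peak.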
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Ferroelectric and Negative Capacitance Devices · Interconnection Networks and Systems
