Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE
Adeel Ahmad, Ahmad Tameem Kamal, Nouman Amir, Bilal Zafar, Saad Bin Nasir

TL;DR
This paper adds RISC-V microkernel support to IREE, an MLIR-based machine learning compiler and runtime, and compares the resulting performance with upstream IREE and Llama.cpp on the Llama-3.2-1B-Instruct model.
Contribution
The paper enables RISC-V microkernel support in IREE by lowering MLIR linalg contraction ops to linalg.mmt4d for the RISC-V64 target and developing optimized RISC-V microkernels, a novel integration for accelerating ML workloads on RISC-V.
Findings
Performance gains over upstream IREE
Improved efficiency for the Llama-3.2-1B-Instruct model
Enhanced RISC-V microkernel capabilities
Abstract
This project enables RISC-V microkernel support in IREE, an MLIR-based machine learning compiler and runtime. The approach begins by enabling the lowering of MLIR linalg dialect contraction ops to the linalg.mmt4d op for the RISC-V64 target within the IREE pass pipeline, followed by the development of optimized microkernels for RISC-V. Performance is compared with upstream IREE and Llama.cpp for the Llama-3.2-1B-Instruct model.
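To make the lowering target concrete, the following is a minimal pure-Python sketch of what linalg.mmt4d computes: a matmul over operands packed into tiles, with the right-hand side stored in transposed tile layout, so that the three innermost loops form the fixed-size body a microkernel would implement. The helper names (pack_lhs, pack_rhs_transposed, mmt4d, unpack, matmul_reference) and the tile sizes are hypothetical and chosen for illustration only. They are not the paper's code or IREE's API, and the sketch assumes matrix dimensions divisible by the tile sizes, so no padding is modeled.

```python
# Illustrative sketch of linalg.mmt4d semantics. Not IREE code.
# Layouts: lhs[M1][K1][M0][K0], rhs[N1][K1][N0][K0], acc[M1][N1][M0][N0].
# Assumption: M, N, K are exact multiples of the tile sizes M0, N0, K0.

def pack_lhs(a, m0, k0):
    """Pack a row-major M x K matrix into tiles indexed [M1][K1][M0][K0]."""
    m, k = len(a), len(a[0])
    return [[[[a[i1 * m0 + i0][p1 * k0 + p0] for p0 in range(k0)]
              for i0 in range(m0)]
             for p1 in range(k // k0)]
            for i1 in range(m // m0)]

def pack_rhs_transposed(b, n0, k0):
    """Pack a K x N matrix into transposed tiles indexed [N1][K1][N0][K0]."""
    k, n = len(b), len(b[0])
    return [[[[b[p1 * k0 + p0][j1 * n0 + j0] for p0 in range(k0)]
              for j0 in range(n0)]
             for p1 in range(k // k0)]
            for j1 in range(n // n0)]

def mmt4d(lhs, rhs, m0, n0, k0):
    """Tiled matmul with a transposed RHS, accumulating into zeroed tiles."""
    m1, k1, n1 = len(lhs), len(lhs[0]), len(rhs)
    acc = [[[[0] * n0 for _ in range(m0)] for _ in range(n1)]
           for _ in range(m1)]
    for i1 in range(m1):
        for j1 in range(n1):
            for p1 in range(k1):
                # Inner tile product: the part a microkernel would replace
                # with hand-tuned, target-specific code.
                for i0 in range(m0):
                    for j0 in range(n0):
                        for p0 in range(k0):
                            acc[i1][j1][i0][j0] += (
                                lhs[i1][p1][i0][p0] * rhs[j1][p1][j0][p0])
    return acc

def unpack(acc, m0, n0):
    """Convert [M1][N1][M0][N0] tiles back to a row-major M x N matrix."""
    m1, n1 = len(acc), len(acc[0])
    return [[acc[i // m0][j // n0][i % m0][j % n0]
             for j in range(n1 * n0)]
            for i in range(m1 * m0)]

def matmul_reference(a, b):
    """Plain triple-loop matmul used to check the tiled result."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Usage: a 4x4 by 4x6 product with illustrative tile sizes M0=2, N0=3, K0=2.
a = [[i * 4 + j + 1 for j in range(4)] for i in range(4)]
b = [[(p * 6 + j) % 5 - 2 for j in range(6)] for p in range(4)]
tiles = mmt4d(pack_lhs(a, 2, 2), pack_rhs_transposed(b, 3, 2), 2, 3, 2)
assert unpack(tiles, 2, 3) == matmul_reference(a, b)
```

Storing the right-hand side transposed means both operands of the inner tile product are read along contiguous K0 elements, which is the general reason a packed layout of this kind is convenient for hand-written kernels. The tile shapes actually selected for RISC-V in the paper are not stated here.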
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Security and Verification in Computing
