Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE

Adeel Ahmad; Ahmad Tameem Kamal; Nouman Amir; Bilal Zafar; Saad Bin Nasir

arXiv:2508.14899·cs.AR·August 22, 2025

TL;DR

This paper adds RISC-V microkernel support to IREE, an MLIR-based machine learning compiler and runtime, and compares the resulting performance against upstream IREE and Llama.cpp on the Llama-3.2-1B-Instruct model.

Contribution

The work enables RISC-V microkernel support in IREE in two steps: it first enables lowering of MLIR linalg dialect contraction ops to the linalg.mmt4d op for the RISC-V64 target within the IREE pass pipeline, and then develops optimized microkernels for RISC-V.

Findings

01 Performance gains over upstream IREE

02 Improved efficiency for the Llama-3.2-1B-Instruct model

03 Enhanced RISC-V microkernel capabilities

Abstract

This project enables RISC-V microkernel support in IREE, an MLIR-based machine learning compiler and runtime. The approach begins by enabling the lowering of MLIR linalg dialect contraction ops to the linalg.mmt4d op for the RISC-V64 target within the IREE pass pipeline, followed by the development of optimized microkernels for RISC-V. The resulting performance is compared with upstream IREE and Llama.cpp for the Llama-3.2-1B-Instruct model.
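
To make the lowering described in the abstract concrete, the following is a minimal pure-Python sketch of what linalg.mmt4d computes: a matrix multiplication over operands packed into 4D tiles, with the right-hand side transposed, so that acc[m1][n1][m0][n0] accumulates lhs[m1][k1][m0][k0] * rhs[n1][k1][n0][k0]. The helper names (pack, unpack, matmul_via_mmt4d) and the tile sizes M0=2, N0=3, K0=2 are illustrative assumptions, not taken from the paper or from IREE's source, and the sketch ignores padding for shapes that are not multiples of the tile size. The innermost three loops correspond to the per-tile work that an optimized microkernel would replace.

```python
def transpose(matrix):
    """Return the transpose of a 2D list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def pack(matrix, tile_rows, tile_cols):
    """Pack a 2D matrix into a 4D layout [R1][C1][R0][C0] of tiles.

    Hypothetical helper: assumes both dimensions divide evenly by the tile size.
    """
    rows, cols = len(matrix), len(matrix[0])
    assert rows % tile_rows == 0 and cols % tile_cols == 0
    return [
        [
            [
                [matrix[r1 * tile_rows + r0][c1 * tile_cols + c0]
                 for c0 in range(tile_cols)]
                for r0 in range(tile_rows)
            ]
            for c1 in range(cols // tile_cols)
        ]
        for r1 in range(rows // tile_rows)
    ]


def mmt4d(lhs4, rhs4):
    """Reference semantics of mmt4d.

    lhs4: [M1][K1][M0][K0], rhs4: [N1][K1][N0][K0] -> acc: [M1][N1][M0][N0]
    """
    m1_count, k1_count = len(lhs4), len(lhs4[0])
    m0_count, k0_count = len(lhs4[0][0]), len(lhs4[0][0][0])
    n1_count, n0_count = len(rhs4), len(rhs4[0][0])
    acc = [[[[0] * n0_count for _ in range(m0_count)]
            for _ in range(n1_count)]
           for _ in range(m1_count)]
    for m1 in range(m1_count):
        for n1 in range(n1_count):
            for k1 in range(k1_count):
                lhs_tile, rhs_tile = lhs4[m1][k1], rhs4[n1][k1]
                acc_tile = acc[m1][n1]
                # Inner tile loops: the part a microkernel would implement.
                for m0 in range(m0_count):
                    for n0 in range(n0_count):
                        for k0 in range(k0_count):
                            acc_tile[m0][n0] += lhs_tile[m0][k0] * rhs_tile[n0][k0]
    return acc


def unpack(acc4):
    """Unpack a [M1][N1][M0][N0] accumulator back into a 2D M x N matrix."""
    m1_count, n1_count = len(acc4), len(acc4[0])
    m0_count, n0_count = len(acc4[0][0]), len(acc4[0][0][0])
    return [
        [acc4[m // m0_count][n // n0_count][m % m0_count][n % n0_count]
         for n in range(n1_count * n0_count)]
        for m in range(m1_count * m0_count)
    ]


def matmul_reference(a, b):
    """Plain 2D matmul used to check the tiled version."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matmul_via_mmt4d(a, b, m0=2, n0=3, k0=2):
    """Compute a @ b by packing, running mmt4d, and unpacking (illustrative tile sizes)."""
    lhs4 = pack(a, m0, k0)             # [M1][K1][M0][K0]
    rhs4 = pack(transpose(b), n0, k0)  # [N1][K1][N0][K0]
    return unpack(mmt4d(lhs4, rhs4))
```

Calling matmul_via_mmt4d(a, b) on a 4x4 matrix a and a 4x6 matrix b should give the same result as matmul_reference(a, b). The transposed, tiled layout means each inner tile product reads both operands contiguously along K0, which is the property that makes a fixed-shape microkernel practical to hand-optimize for a given target.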

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Security and Verification in Computing