Dissecting RISC-V Performance: Practical PMU Profiling and Hardware-Agnostic Roofline Analysis on Emerging Platforms
Alexander Batashev

TL;DR
This paper introduces a practical, hardware-agnostic methodology for performance profiling and Roofline analysis on RISC-V platforms, one that works around hardware bugs and tooling limitations.
Contribution
It presents novel LLVM-based Roofline tooling and a hardware-bug workaround, enabling reliable performance analysis on emerging RISC-V systems.
Findings
Robust event sampling workaround for hardware bugs
Compiler-driven Roofline analysis without hardware PMUs
Open source toolchain automates profiling workflow
Abstract
As RISC-V architectures proliferate across embedded and high-performance domains, developers face persistent challenges in performance optimization due to fragmented tooling, immature hardware features, and platform-specific defects. This paper delivers a pragmatic methodology for extracting actionable performance insights on RISC-V systems, even under constrained or unreliable hardware conditions. We present a workaround to circumvent hardware bugs in one of the popular RISC-V implementations, enabling robust event sampling. For memory-compute bottleneck analysis, we introduce compiler-driven Roofline tooling that operates without hardware PMU dependencies, leveraging LLVM-based instrumentation to derive operational intensity and throughput metrics directly from application IR. Our open source toolchain automates these workarounds, unifying PMU data correction and compiler-guided…
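To make the Roofline terms in the abstract concrete, here is a minimal sketch of the standard Roofline relation: attainable throughput is the smaller of peak compute and operational intensity times peak memory bandwidth. It is not the paper's tooling. The function names, the hypothetical machine peaks (8 GFLOP/s, 4 GB/s), and the AXPY-style kernel counts are all illustrative assumptions; in a compiler-driven flow, the FLOP and byte counts would presumably come from instrumentation rather than being written by hand.

```python
def operational_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def attainable_gflops(oi: float, peak_gflops: float, peak_bw_gbs: float) -> float:
    """Standard Roofline bound: min(peak compute, OI * peak bandwidth)."""
    return min(peak_gflops, oi * peak_bw_gbs)


def is_memory_bound(oi: float, peak_gflops: float, peak_bw_gbs: float) -> bool:
    """A kernel is memory-bound when its OI lies left of the ridge point."""
    ridge_point = peak_gflops / peak_bw_gbs
    return oi < ridge_point


if __name__ == "__main__":
    # Hypothetical AXPY-style kernel over n doubles: y[i] = a * x[i] + y[i].
    # Assumed counts: 2 FLOPs and 24 bytes (two loads, one store) per element.
    n = 1_000_000
    oi = operational_intensity(flops=2 * n, bytes_moved=24 * n)

    # Hypothetical machine peaks, not measurements of any real RISC-V board.
    peak_gflops, peak_bw_gbs = 8.0, 4.0
    bound = attainable_gflops(oi, peak_gflops, peak_bw_gbs)
    print(f"OI = {oi:.4f} FLOP/byte, bound = {bound:.4f} GFLOP/s, "
          f"memory-bound = {is_memory_bound(oi, peak_gflops, peak_bw_gbs)}")
```

With these assumed numbers the kernel sits well left of the ridge point (2 FLOP/byte), so the bandwidth roof, not peak compute, limits it.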
