FPGA-Accelerated RISC-V ISA Extensions for Efficient Neural Network Inference on Edge Devices
Arya Parameshwara, Santosh Hanamappa Mokashi

TL;DR
This paper introduces FPGA-accelerated RISC-V ISA extensions with integrated neural network accelerators, achieving a 2.14x average latency speedup and a 49.1% energy reduction over an ARM Cortex-A9 software baseline for edge AI inference on resource-constrained devices.
Contribution
It presents a custom RISC-V core with four new ISA extensions (FPGA.VCONV, FPGA.GEMM, FPGA.RELU, FPGA.CUSTOM) and an FPGA implementation on the Xilinx PYNQ-Z2 for neural network inference, balancing performance and programmability on edge hardware.
Findings
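2.14x average latency speedup over an ARM Cortex-A9 software baseline
49.1% energy reduction across four benchmark models (MobileNet V2, ResNet-18, EfficientNet Lite, YOLO Tiny)
Successful FPGA deployment on the Xilinx PYNQ-Z2, with timing closed at 50 MHz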
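
As a rough reading of the speedup and energy figures above, the sketch below converts them into relative latency and relative energy, then derives the average power ratio they would imply. It assumes energy equals average power times latency, and that the two averaged figures can be combined as if they described a single workload; the source states neither, so the power ratio is only indicative. The function names are illustrative and do not come from the paper.

```python
def relative_latency(speedup: float) -> float:
    """Latency of the accelerated system as a fraction of the baseline."""
    return 1.0 / speedup


def relative_energy(reduction_pct: float) -> float:
    """Energy of the accelerated system as a fraction of the baseline."""
    return 1.0 - reduction_pct / 100.0


def implied_power_ratio(speedup: float, reduction_pct: float) -> float:
    """Average power ratio, ASSUMING energy = average power * latency."""
    return relative_energy(reduction_pct) / relative_latency(speedup)


# Figures reported in the summary: 2.14x speedup, 49.1% energy reduction.
lat = relative_latency(2.14)
eng = relative_energy(49.1)
pwr = implied_power_ratio(2.14, 49.1)

print(f"relative latency: {lat:.3f}")
print(f"relative energy:  {eng:.3f}")
print(f"implied average power ratio: {pwr:.3f}")
```

Under these assumptions the accelerated system finishes in a little under half the baseline time and uses about half the energy, so its average power draw comes out slightly above the baseline. The energy saving would then stem from shorter run time rather than from lower power.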
Abstract
Edge AI deployment faces critical challenges balancing computational performance, energy efficiency, and resource constraints. This paper presents FPGA-accelerated RISC-V instruction set architecture (ISA) extensions for efficient neural network inference on resource-constrained edge devices. We introduce a custom RISC-V core with four novel ISA extensions (FPGA.VCONV, FPGA.GEMM, FPGA.RELU, FPGA.CUSTOM) and integrated neural network accelerators, implemented and validated on the Xilinx PYNQ-Z2 platform. The complete system achieves 2.14x average latency speedup and 49.1% energy reduction versus an ARM Cortex-A9 software baseline across four benchmark models (MobileNet V2, ResNet-18, EfficientNet Lite, YOLO Tiny). Hardware implementation closes timing with +12.793 ns worst negative slack at 50 MHz while using 0.43% LUTs and 11.4% BRAM for the base core and 38.8% DSPs when accelerators…
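
The timing figure in the abstract can be unpacked with a little arithmetic. A positive worst-negative-slack value means the constraint is met with margin, so the slack can be subtracted from the clock period to estimate the critical-path delay. The sketch below does this for the reported 50 MHz clock and +12.793 ns slack. The resulting frequency ceiling is a back-of-the-envelope estimate: it assumes a single clock domain and ignores clock uncertainty, hold timing, and any re-routing effects at a tighter constraint, none of which the source discusses. The helper names are illustrative.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1e3 / freq_mhz


def critical_path_ns(freq_mhz: float, slack_ns: float) -> float:
    """Estimated critical-path delay: clock period minus setup slack."""
    return period_ns(freq_mhz) - slack_ns


def estimated_fmax_mhz(freq_mhz: float, slack_ns: float) -> float:
    """Rough frequency ceiling, ASSUMING the same critical path dominates."""
    return 1e3 / critical_path_ns(freq_mhz, slack_ns)


# Values reported in the abstract: 50 MHz clock, +12.793 ns slack.
critical = critical_path_ns(50.0, 12.793)
fmax = estimated_fmax_mhz(50.0, 12.793)

print(f"clock period:        {period_ns(50.0):.3f} ns")
print(f"critical-path delay: {critical:.3f} ns")
print(f"estimated Fmax:      {fmax:.1f} MHz")
```

With a 20 ns period and roughly 7.2 ns of critical-path delay, the design appears to have substantial headroom above 50 MHz, although the abstract reports results only at that frequency.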
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Adversarial Robustness in Machine Learning
