DPUConfig: Optimizing ML Inference in FPGAs Using Reinforcement Learning

Alexandros Patras; Spyros Lalis; Christos D. Antonopoulos; Nikolaos Bellas

arXiv:2602.12847 · cs.AR · February 16, 2026

Open Access

TL;DR

This paper presents DPUConfig, a reinforcement learning-based framework that dynamically selects DPU configurations for FPGA-based ML inference, improving energy efficiency for CNN models on embedded systems.

Contribution

Introduces DPUConfig, a novel RL-driven runtime system for optimizing FPGA DPU configurations based on real-time telemetry data.

Findings

1. RL agent achieves 95% of optimal energy efficiency
2. Effective dynamic configuration selection for CNN inference
3. Improved resource utilization and power management

Abstract

Heterogeneous embedded systems, with diverse computing elements and accelerators such as FPGAs, offer a promising platform for fast and flexible ML inference, which is crucial for services such as autonomous driving and augmented reality, where delays can be costly. However, efficiently allocating computational resources for deep learning applications in FPGA-based systems is a challenging task. A Deep Learning Processor Unit (DPU) is a parameterizable FPGA-based accelerator module optimized for ML inference. It supports a wide range of ML models and can be instantiated multiple times within a single FPGA to enable concurrent execution. This paper introduces DPUConfig, a novel runtime management framework, based on a custom Reinforcement Learning (RL) agent, that dynamically selects optimal DPU configurations by leveraging real-time telemetry data monitoring, system utilization, power…
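To make the idea of telemetry-driven configuration selection concrete, here is a minimal sketch. It is not the paper's custom RL agent: it uses a plain tabular epsilon-greedy contextual bandit, a much simpler technique. The configuration labels (`cfg_a`, `cfg_b`, `cfg_c`), the two load states, the efficiency numbers, and the `measure` function are all assumptions invented for illustration; a real system would read utilization and power from the platform.

```python
import random

# Hypothetical labels: the excerpt does not list the real DPU configurations
# or how telemetry is encoded, so both sets below are invented for illustration.
CONFIGS = ["cfg_a", "cfg_b", "cfg_c"]
STATES = ["low_load", "high_load"]

# Invented mean efficiency (e.g. inferences per joule) per (state, config).
MEAN_EFFICIENCY = {
    ("low_load", "cfg_a"): 1.0,
    ("low_load", "cfg_b"): 0.7,
    ("low_load", "cfg_c"): 0.4,
    ("high_load", "cfg_a"): 0.4,
    ("high_load", "cfg_b"): 0.7,
    ("high_load", "cfg_c"): 1.0,
}


class EpsilonGreedyAgent:
    """Tabular agent keeping a running mean reward per (state, action)."""

    def __init__(self, actions, epsilon=0.2, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {}  # (state, action) -> running mean reward
        self.count = {}  # (state, action) -> number of updates

    def best(self, state):
        # Greedy choice: the action with the highest estimated reward.
        return max(self.actions, key=lambda a: self.value.get((state, a), 0.0))

    def select(self, state):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best(state)

    def update(self, state, action, reward):
        # Incremental mean: new = old + (reward - old) / n.
        key = (state, action)
        self.count[key] = self.count.get(key, 0) + 1
        old = self.value.get(key, 0.0)
        self.value[key] = old + (reward - old) / self.count[key]


def measure(state, action, rng):
    """Stand-in for a telemetry reading: invented mean plus Gaussian noise."""
    return MEAN_EFFICIENCY[(state, action)] + rng.gauss(0.0, 0.05)


def train(steps=3000, seed=0):
    env_rng = random.Random(seed + 1)
    agent = EpsilonGreedyAgent(CONFIGS, seed=seed)
    for step in range(steps):
        state = STATES[step % len(STATES)]  # alternate the two load states
        action = agent.select(state)
        agent.update(state, action, measure(state, action, env_rng))
    return agent


if __name__ == "__main__":
    agent = train()
    for state in STATES:
        print(state, "->", agent.best(state))
```

The sketch only captures the select, measure, update loop. A full RL formulation would also model how the current choice affects future states, for example the cost of reconfiguring the FPGA, which a bandit ignores.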

Taxonomy

Topics: Advanced Neural Network Applications · Embedded Systems Design Techniques · Low-power high-performance VLSI design