Learning Quantized Continuous Controllers for Integer Hardware

Fabian Kresse; Christoph H. Lampert

arXiv:2511.07046·cs.LG·November 18, 2025

TL;DR

This paper presents a method for training low-bit quantized policies for reinforcement learning that can be efficiently implemented on embedded FPGA hardware, achieving low latency and power consumption while maintaining performance.

Contribution

The paper introduces a learning-to-hardware pipeline that automatically selects low-bit policies and synthesizes them for integer inference on an Artix-7 FPGA, and it shows that policies with as few as 2 or 3 bits per weight and per internal activation remain competitive with FP32 policies.

Findings

01

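Across five MuJoCo tasks, policies with as few as 2-3 bits per weight and per internal activation perform comparably to full-precision (FP32) policies, provided input precision is chosen carefully.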

02

On the target Artix-7 FPGA, the selected quantized policies achieve inference latencies on the order of microseconds and consume microjoules per action.

03

Quantized policies show increased robustness to input noise.
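
The latency and energy findings are linked by simple arithmetic: energy per action is average power multiplied by inference latency, and watts times microseconds gives microjoules directly. The sketch below illustrates only that unit relationship. The function name and the numbers are hypothetical and are not measurements from the paper.

```python
def energy_per_action_uj(avg_power_w: float, latency_us: float) -> float:
    """Energy per inference in microjoules.

    1 W * 1 us = 1 uJ, so no unit conversion factor is needed.
    """
    return avg_power_w * latency_us


# Hypothetical values chosen only to illustrate the arithmetic.
example_power_w = 0.5    # assumed average power draw during inference
example_latency_us = 4.0  # assumed latency of one forward pass
example_energy_uj = energy_per_action_uj(example_power_w, example_latency_us)
print(f"{example_energy_uj} uJ per action")
```

With the assumed 0.5 W and 4 µs, the result is 2 µJ per action, which shows why microsecond-scale latency on a sub-watt device lands in the microjoule range.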

Abstract

Deploying continuous-control reinforcement learning policies on embedded hardware requires meeting tight latency and power budgets. Small FPGAs can deliver these, but only if costly floating point pipelines are avoided. We study quantization-aware training (QAT) of policies for integer inference and we present a learning-to-hardware pipeline that automatically selects low-bit policies and synthesizes them to an Artix-7 FPGA. Across five MuJoCo tasks, we obtain policy networks that are competitive with full precision (FP32) policies but require as few as 3 or even only 2 bits per weight, and per internal activation value, as long as input precision is chosen carefully. On the target hardware, the selected policies achieve inference latencies on the order of microseconds and consume microjoules per action, favorably comparing to a quantized reference. Last, we observe that the quantized…
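
To make "low-bit weights with integer inference" concrete, here is a minimal sketch of symmetric uniform quantization followed by an integer-only dot product. It is an illustration under stated assumptions, not the paper's quantizer: the function names, the single shared scale per tensor, and the example values are all hypothetical. Quantization-aware training typically applies this kind of rounding in the forward pass during training and passes gradients through it unchanged; the training loop is omitted here.

```python
def quantize_symmetric(values, bits):
    """Map floats to signed integer codes with one shared scale (assumed scheme).

    Codes lie in [-(2**(bits-1) - 1), 2**(bits-1) - 1], so 3 bits gives -3..3
    and 2 bits gives the ternary set {-1, 0, 1}.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def integer_dot(w_codes, x_codes):
    """Dot product using integer arithmetic only, as integer hardware would."""
    return sum(w * x for w, x in zip(w_codes, x_codes))


# Hypothetical weights and inputs for a single neuron.
weights = [0.42, -0.17, 0.05, -0.61]
inputs = [0.9, -0.3, 0.2, 0.5]

w_codes, w_scale = quantize_symmetric(weights, bits=3)  # low-bit weights
x_codes, x_scale = quantize_symmetric(inputs, bits=8)   # higher input precision

acc = integer_dot(w_codes, x_codes)   # integer accumulator
approx = acc * w_scale * x_scale      # rescale once, only for comparison
exact = sum(w * x for w, x in zip(weights, inputs))

print("weight codes:", w_codes)
print("integer accumulator:", acc)
print("dequantized vs float:", approx, exact)
```

The inputs use more bits than the weights in this sketch, echoing the abstract's remark that input precision has to be chosen carefully. The final rescale is done in floating point purely to compare against the float result; on integer hardware the scales would typically be folded into later layers or thresholds.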

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Neural Network Applications · Embedded Systems Design Techniques