Learning Quantized Continuous Controllers for Integer Hardware
Fabian Kresse, Christoph H. Lampert

TL;DR
This paper presents a method for training low-bit quantized policies for reinforcement learning that can be efficiently implemented on embedded FPGA hardware, achieving low latency and power consumption while maintaining performance.
Contribution
It introduces a learning-to-hardware pipeline for automatic selection and synthesis of low-bit policies for integer inference on FPGAs, demonstrating competitive results with minimal bit-widths.
Findings
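Across five MuJoCo tasks, policies with as few as 2 or 3 bits per weight and per internal activation are competitive with full-precision (FP32) policies, provided input precision is chosen carefully.
On the Artix-7 FPGA target, the selected policies reach inference latencies on the order of microseconds and consume microjoules per action.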
Quantized policies show increased robustness to input noise.
Abstract
Deploying continuous-control reinforcement learning policies on embedded hardware requires meeting tight latency and power budgets. Small FPGAs can deliver these, but only if costly floating-point pipelines are avoided. We study quantization-aware training (QAT) of policies for integer inference and we present a learning-to-hardware pipeline that automatically selects low-bit policies and synthesizes them to an Artix-7 FPGA. Across five MuJoCo tasks, we obtain policy networks that are competitive with full precision (FP32) policies but require as few as 3 or even only 2 bits per weight, and per internal activation value, as long as input precision is chosen carefully. On the target hardware, the selected policies achieve inference latencies on the order of microseconds and consume microjoules per action, comparing favorably to a quantized reference. Last, we observe that the quantized…
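As a rough illustration of what integer inference at low bit-widths involves, the following minimal Python sketch quantizes a weight vector to 3 bits and an input vector to 4 bits, accumulates the dot product using integers only, and rescales once at the end. It is not the authors' pipeline: the uniform symmetric quantizer, the helper names (`quantize`, `integer_dot`), the scales, and the example values are all assumptions chosen for illustration.

```python
def quantize(values, bits, scale):
    """Uniform symmetric quantization (an assumed scheme, not the paper's).

    Maps each float to a signed integer clipped to
    [-(2**(bits-1)), 2**(bits-1) - 1].
    """
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def integer_dot(q_weights, q_inputs):
    """Dot product over integers only; no floating-point operations."""
    return sum(w * x for w, x in zip(q_weights, q_inputs))


# Hypothetical example values and scales.
weights = [0.42, -0.9, 0.13, 0.77]
inputs = [1.0, -0.5, 0.25, 0.75]
w_scale = 0.3    # hypothetical weight step size
x_scale = 0.25   # hypothetical input step size

q_w = quantize(weights, bits=3, scale=w_scale)  # 3-bit weights
q_x = quantize(inputs, bits=4, scale=x_scale)   # 4-bit inputs

acc = integer_dot(q_w, q_x)          # integer accumulator
approx = acc * w_scale * x_scale     # single rescale back to real units
exact = sum(w * x for w, x in zip(weights, inputs))

print("quantized weights:", q_w)
print("quantized inputs: ", q_x)
print("integer accumulator:", acc)
print("approximate vs. exact:", approx, exact)
```

In quantization-aware training, a rounding step of this kind is typically simulated in the forward pass during training so that the learned weights tolerate it; on hardware, only the integer accumulation and a final rescale remain.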
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Neural Network Applications · Embedded Systems Design Techniques
