Binarized P-Network: Deep Reinforcement Learning of Robot Control from Raw Images on FPGA

Yuki Kadokawa; Yoshihisa Tsurumine; Takamitsu Matsubara

arXiv:2109.04966·cs.RO·September 16, 2021

Open Access

TL;DR

This paper introduces Binarized P-Network, a novel deep reinforcement learning algorithm using Binarized CNNs for robot control from raw images, optimized for FPGA implementation to achieve power efficiency.

Contribution

The paper presents a new DRL algorithm, Binarized P-Network, that enables FPGA-based image control for robots using BCNNs and a robust value update scheme.

Findings

01 Effective in simulation and real-robot FPGA experiments

02 Achieves power-efficient image-based control

03 Demonstrates stable learning with Conservative Value Iteration

Abstract

This paper explores a Deep Reinforcement Learning (DRL) approach for designing image-based control for edge robots to be implemented on Field Programmable Gate Arrays (FPGAs). Although FPGAs are more power-efficient than CPUs and GPUs, typical DRL methods cannot be applied to them, because FPGAs are composed of many Logic Blocks (LBs) that are fast at logical operations but slow at real-number operations. To cope with this problem, we propose a novel DRL algorithm called Binarized P-Network (BPN), which learns image-input control policies using Binarized Convolutional Neural Networks (BCNNs). Because a BCNN has low function approximation accuracy, which destabilizes reinforcement learning, our BPN adopts a robust value update scheme called Conservative Value Iteration, which is tolerant of function approximation errors. We confirmed the BPN's effectiveness through applications to a…
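The abstract's point that FPGAs favor logical over real-number operations can be illustrated with the standard XNOR-popcount trick used in binarized networks in general. The sketch below is not taken from the paper: the function names `binarize`, `pack_bits`, and `xnor_popcount_dot` are hypothetical, and the choice to map zero to +1 is an assumption. It shows that a dot product between two vectors with entries in {-1, +1} can be computed with bitwise operations and a bit count alone.

```python
def binarize(values):
    """Map real values to {-1, +1} by sign (assumption: 0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Encode a {-1, +1} vector as an int: bit i is set when signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    XNOR marks the positions where the signs agree. Each agreement
    contributes +1 and each disagreement -1, so the dot product is
    matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                  # keep only the n valid bits
    agree = ~(a_bits ^ b_bits) & mask    # XNOR restricted to n bits
    matches = bin(agree).count("1")      # popcount
    return 2 * matches - n


# Compare against the ordinary multiply-accumulate form.
x = binarize([0.3, -1.2, 0.0, 2.5, -0.1])
w = binarize([-0.7, -0.4, 1.1, 0.9, 0.2])
reference = sum(xi * wi for xi, wi in zip(x, w))
fast = xnor_popcount_dot(pack_bits(x), pack_bits(w), len(x))
assert fast == reference
```

On hardware, the XOR, inversion, and bit count map onto logic blocks directly, which is the general motivation for binarized layers on FPGAs; how the BPN itself is laid out on the device is not described in this excerpt.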

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Memory and Neural Computing · Adversarial Robustness in Machine Learning