Real-Time Reinforcement Learning for Vision-Based Robotics Utilizing Local and Remote Computers
Yan Wang, Gautham Vasan, A. Rupam Mahmood

TL;DR
This paper introduces ReLoD, a distributed real-time reinforcement learning system for vision-based robotics, and shows how distributing computation between a local and a remote computer affects the performance of SAC and PPO on robotic control tasks.
Contribution
The paper presents ReLoD, a novel system for distributing RL computations between local and remote computers, and analyzes its impact on algorithm performance in real-time robotic vision tasks.
Findings
SAC performance degrades on resource-limited local computers.
Distributed SAC improves performance when computations are carefully allocated.
PPO performance remains largely unaffected by how computations are distributed.
Abstract
Real-time learning is crucial for robotic agents adapting to ever-changing, non-stationary environments. A common setup for a robotic agent is to have two different computers simultaneously: a resource-limited local computer tethered to the robot and a powerful remote computer connected wirelessly. Given such a setup, it is unclear to what extent the performance of a learning system can be affected by resource limitations and how to efficiently use the wirelessly connected powerful computer to compensate for any performance loss. In this paper, we implement a real-time learning system called the Remote-Local Distributed (ReLoD) system to distribute computations of two deep reinforcement learning (RL) algorithms, Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO), between a local and a remote computer. The performance of the system is evaluated on two vision-based control…
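The local/remote setup described in the abstract can be pictured with a minimal sketch. It is not the ReLoD implementation. It assumes one possible split, chosen here only for illustration, in which the local computer selects actions and the remote computer performs learning updates. The two in-process queues stand in for the wireless link, and the names `run_local_actor`, `run_remote_learner`, and `simulate` are hypothetical, as are the placeholder "policy version" integers used in place of real network parameters.

```python
import queue
import threading


def run_local_actor(transitions, policy_updates, num_steps):
    """Hypothetical local computer: acts and forwards transitions."""
    policy_version = 0
    for step in range(num_steps):
        # Drain any pending policy updates without blocking the control loop,
        # keeping only the newest one.
        try:
            while True:
                policy_version = policy_updates.get_nowait()
        except queue.Empty:
            pass
        # Placeholder for reading the camera and selecting an action.
        transitions.put({"step": step, "policy_version": policy_version})
    transitions.put(None)  # sentinel: the actor has finished


def run_remote_learner(transitions, policy_updates, batch_size):
    """Hypothetical remote computer: stores data and performs updates."""
    buffer = []
    num_updates = 0
    while True:
        item = transitions.get()  # blocks until the actor sends something
        if item is None:
            break
        buffer.append(item)
        if len(buffer) % batch_size == 0:
            # Placeholder for a gradient update (e.g. a SAC or PPO step).
            num_updates += 1
            policy_updates.put(num_updates)
    return len(buffer), num_updates


def simulate(num_steps=100, batch_size=10):
    """Run actor and learner concurrently; queues mimic the network link."""
    transitions = queue.Queue()
    policy_updates = queue.Queue()
    result = {}

    def learner_target():
        result["stored"], result["updates"] = run_remote_learner(
            transitions, policy_updates, batch_size
        )

    learner = threading.Thread(target=learner_target)
    learner.start()
    run_local_actor(transitions, policy_updates, num_steps)
    learner.join()
    return result["stored"], result["updates"]


if __name__ == "__main__":
    stored, updates = simulate()
    print(f"transitions stored: {stored}, learner updates: {updates}")
```

In this sketch the actor never waits on the learner, so a slow link or slow update only delays when a newer policy arrives; it does not stall action selection. Whether that matches the allocation the paper found to work best is not something this listing states.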
Taxonomy
Topics: Reinforcement Learning in Robotics · Distributed Control Multi-Agent Systems · Protein Degradation and Inhibitors
Methods: 1x1 Convolution · Average Pooling · Entropy Regularization · Global Average Pooling · Dilated Convolution · Convolution · Proximal Policy Optimization · Switchable Atrous Convolution
