Benchmarking Deep Reinforcement Learning Algorithms for Vision-based Robotics
Swagat Kumar, Hayden Sampson, Ardhendu Behera

TL;DR
This paper benchmarks several state-of-the-art deep reinforcement learning algorithms on two challenging vision-based robotics tasks. It also proposes strategies for implementing HER on these tasks and for attention-based feature extraction from images, and compares how the algorithms perform.
Contribution
It presents the first comprehensive benchmark of RL algorithms on two complex vision-based robotics environments, KukaDiverseObjectEnv and RacecarZEDGymEnv, and proposes novel strategies for implementing HER and for attention-based feature extraction.
Findings
HER strategies improve learning in single-goal environments.
Attention mechanisms enhance feature extraction from RGB images.
Benchmark results highlight the strengths and limitations of each algorithm.
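As a rough illustration of attention over image features, the sketch below pools a set of per-location feature vectors into one vector using softmax weights. It is a generic soft-attention example, not the architecture used in the paper; the function names `softmax` and `spatial_attention_pool` and the fixed `query` vector are assumptions made for illustration (in a real network the query and features would be learned).

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention_pool(
    features: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Pool per-location feature vectors with soft attention.

    features: one feature vector per image location (hypothetical CNN output).
    query:    vector used to score each location (assumed given here).
    Returns the attention-weighted feature vector and the weights.
    """
    # Score each location by its dot product with the query.
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the location features.
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0]]
    pooled, weights = spatial_attention_pool(feats, query=[2.0, 0.0])
    print(pooled, weights)
```

Locations whose features align with the query receive larger weights, so the pooled vector emphasizes the image regions most relevant to the task.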
Abstract
This paper presents a benchmarking study of several state-of-the-art reinforcement learning algorithms applied to two simulated vision-based robotics problems. The algorithms considered include soft actor-critic (SAC), proximal policy optimization (PPO), interpolated policy gradients (IPG), and their variants with Hindsight Experience Replay (HER). Their performance is compared on two PyBullet simulation environments, KukaDiverseObjectEnv and RacecarZEDGymEnv. In these environments the state observations are RGB images and the action space is continuous, which makes them difficult to solve. A number of strategies are suggested for providing the intermediate hindsight goals required to implement the HER algorithm on these problems, which are essentially single-goal environments. In addition, a number of…
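To make the hindsight-goal idea concrete, here is a minimal sketch of HER relabeling using the generic "future" strategy from the original HER formulation. It is not the paper's own strategy, which this excerpt does not detail. The names `relabel_future` and `sparse_reward`, the transition tuple layout, and the use of short state vectors in place of image features are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Assumed layout: (state, action, next_state, goal, reward)
Transition = Tuple[
    Sequence[float], Sequence[float], Sequence[float], Sequence[float], float
]


def sparse_reward(
    achieved: Sequence[float], goal: Sequence[float], tol: float = 1e-3
) -> float:
    """Return 0 when the achieved state is within tol of the goal, else -1."""
    dist = max(abs(a - g) for a, g in zip(achieved, goal))
    return 0.0 if dist <= tol else -1.0


def relabel_future(
    episode: List[Transition],
    k: int = 4,
    reward_fn: Callable[[Sequence[float], Sequence[float]], float] = sparse_reward,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Augment an episode with k hindsight copies of each transition.

    Each copy replaces the original goal with a state actually reached at
    the same or a later step, and recomputes the reward for that new goal.
    """
    rng = rng or random.Random(0)
    out: List[Transition] = []
    for t, (s, a, s_next, goal, r) in enumerate(episode):
        out.append((s, a, s_next, goal, r))  # keep the original transition
        for _ in range(k):
            future = rng.randrange(t, len(episode))
            new_goal = episode[future][2]  # state reached at step `future`
            out.append((s, a, s_next, new_goal, reward_fn(s_next, new_goal)))
    return out


if __name__ == "__main__":
    # Toy 1-D episode that never reaches the original goal [10.0].
    ep = [([float(i)], [1.0], [float(i + 1)], [10.0], -1.0) for i in range(3)]
    for tr in relabel_future(ep, k=2):
        print(tr)
```

In a single-goal environment the original goal is rarely reached early in training, so every stored reward is the failure value; the relabeled copies supply transitions with a success signal that an off-policy learner such as SAC can use.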
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control
Methods: Experience Replay
