Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks

Altun Rzayev; Vahid Tavakol Aghaei

arXiv:2212.05572·cs.RO·December 13, 2022·1 cites

Open Access

TL;DR

This paper compares the efficiency and speed of three off-policy deep reinforcement learning algorithms (DDPG, TD3, and SAC) in training the Fetch robotic manipulator on four tasks in the MuJoCo simulation environment, motivated by the limitations of conventional control methods.

Contribution

It provides a comparative analysis of DDPG, TD3, and SAC algorithms for robotic manipulation tasks, demonstrating their effectiveness and efficiency in simulation.

Findings

1. All three algorithms successfully trained the manipulator.
2. SAC showed the fastest convergence among the three.
3. Off-policy algorithms outperform traditional control methods in speed and data efficiency.

Abstract

Conventional control methods run into obstacles as systems grow more complex and demand dense data, so more modern and efficient control methods are required. Off-policy, model-free reinforcement learning algorithms avoid the need to work with complex models. They stand out in speed and accuracy because they reuse past experience to learn optimal policies. In this study, three reinforcement learning algorithms, DDPG, TD3, and SAC, are used to train the Fetch robotic manipulator on four different tasks in the MuJoCo simulation environment. All three algorithms are off-policy and reach their targets by optimizing both policy and value functions. The study analyzes the efficiency and speed of these three algorithms in a controlled environment.
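
The abstract's point that off-policy algorithms learn from past experience is usually realized with a replay buffer. The sketch below is a minimal illustration in plain Python, not the authors' implementation: the names `Transition` and `ReplayBuffer`, the capacity, and the toy one-dimensional states and rewards are all assumptions made for the example.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One step of interaction: (state, action, reward, next_state, done)."""
    state: Sequence[float]
    action: Sequence[float]
    reward: float
    next_state: Sequence[float]
    done: bool


class ReplayBuffer:
    """Fixed-capacity FIFO store of past transitions (hypothetical helper)."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # deque(maxlen=...) drops the oldest item once capacity is reached.
        self._storage: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self._storage)

    def add(self, transition: Transition) -> None:
        self._storage.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement. The batch can mix data
        # collected under older policies, which is what "off-policy" permits.
        return self._rng.sample(list(self._storage), batch_size)


# Toy usage with made-up 1-D states; a real agent would store simulator steps.
buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.add(Transition([float(t)], [0.0], -1.0, [float(t + 1)], False))

batch = buffer.sample(2)
print(len(buffer), len(batch))
```

DDPG, TD3, and SAC differ in how they compute critic targets and actor updates from such a batch, but each of them can draw its training data from a store of this kind rather than only from the most recent rollout.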

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Software Engineering Methodologies

Methods: Batch Normalization · Dilated Convolution · 1x1 Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Average Pooling · Dense Connections · Weight Decay · Convolution · Deep Deterministic Policy Gradient