Bipedal Walking Robot using Deep Deterministic Policy Gradient

Arun Kumar; Navneet Paul; S N Omkar

arXiv:1807.05924·cs.RO·July 18, 2018·20 cites

Open Access 3 Repos

TL;DR

This paper presents a reinforcement learning approach using Deep Deterministic Policy Gradient to enable a simulated bipedal robot to learn stable walking and running behaviors autonomously, with gait patterns similar to humans.

Contribution

It introduces a novel application of DDPG for training a bipedal robot in simulation, achieving human-like gait patterns without prior knowledge of dynamics.

Findings

01. Robot achieved an average speed of 0.83 m/s.

02. Gait patterns closely resembled human walking.

03. Successful autonomous learning through trial and error.

Abstract

Machine learning algorithms have found several applications in the field of robotics and control systems. The control systems community has started to show interest in several machine learning algorithms from sub-domains such as supervised learning, imitation learning and reinforcement learning to achieve autonomous control and intelligent decision making. Amongst many complex control problems, stable bipedal walking has been the most challenging problem. In this paper, we present an architecture to design and simulate a planar bipedal walking robot (BWR) using a realistic robotics simulator, Gazebo. The robot demonstrates successful walking behaviour by learning through trial and error, without any prior knowledge of itself or the world dynamics. The autonomous walking of the BWR is achieved using a reinforcement learning algorithm called Deep Deterministic Policy…
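The abstract is cut off before it describes the training procedure, so the authors' implementation is not shown here. As a rough illustration only, the sketch below covers two ingredients of the generic DDPG algorithm: the bootstrapped target used to train the critic, and the slow "soft" update of the target networks. The function names, the toy batch, and the values of `gamma` and `tau` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def td_target(reward, done, next_q, gamma=0.99):
    """Critic regression target: y = r + gamma * (1 - done) * Q'(s', mu'(s')).

    next_q is the target critic's value at the next state, evaluated at the
    action chosen by the target actor. gamma is an assumed discount factor.
    """
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    next_q = np.asarray(next_q, dtype=float)
    return reward + gamma * (1.0 - done) * next_q


def soft_update(target_params, online_params, tau=0.001):
    """Polyak averaging: theta_target <- tau * theta + (1 - tau) * theta_target."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


# Hypothetical batch of two transitions; the second one ends the episode
# (for a walking robot this could be a fall), so no future value is added.
y = td_target(reward=[1.0, -10.0], done=[0, 1], next_q=[5.0, 3.0], gamma=0.9)
# The targets are 5.5 and -10.0.

# Move an all-zero target weight vector 10% of the way toward the online weights.
new_target = soft_update([np.zeros(2)], [np.ones(2)], tau=0.1)
```

In a full agent these pieces would sit inside a loop that samples past transitions from an experience replay buffer, fits the critic to `y`, and updates the actor by following the critic's gradient with respect to the action.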

Taxonomy

Topics: Robotic Locomotion and Control · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Experience Replay · Dense Connections · Weight Decay · Adam · Convolution · Batch Normalization · Deep Deterministic Policy Gradient