Teach Biped Robots to Walk via Gait Principles and Reinforcement Learning with Adversarial Critics
Kuangen Zhang, Zhimin Hou, Clarence W. de Silva, Haoyong Yu, and Chenglong Fu

TL;DR
This paper introduces a novel reinforcement learning approach with gait principles and adversarial critics to improve biped robot walking stability and efficiency, demonstrating significant reward and similarity improvements in simulation.
Contribution
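The paper proposes a gait reward based on walking principles, plus ATD3_RNN, an Adversarial Twin Delayed Deep Deterministic (ATD3) policy gradient algorithm with a recurrent neural network (RNN), to improve reinforcement learning for biped robot walking.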
Findings
Adding the gait reward increased test rewards by 23.50% and 9.63% in the two simulators (Roboschool Walker2d and Webots Atlas).
Using ATD3_RNN increased test rewards by a further 15.96% and 12.68%.
The error in estimating the cumulative reward fell from 19.86% to 3.35%.
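To make the last finding concrete, here is a minimal sketch of one way such an estimation error could be computed: compare a critic's estimate with the discounted return actually collected in an episode. This summary does not say how the paper defines the metric, so the relative-error formula, the function names, and the discount factor below are all assumptions for illustration.

```python
def discounted_return(rewards, gamma=0.99):
    """Discounted cumulative reward measured from the first step of an episode."""
    total = 0.0
    # Accumulate backwards so each earlier step applies one more factor of gamma.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def relative_estimation_error(q_estimate, rewards, gamma=0.99):
    """Gap between a critic estimate and the realized return, in percent.

    Assumed definition: |estimate - actual| / |actual| * 100.
    """
    actual = discounted_return(rewards, gamma)
    return abs(q_estimate - actual) / abs(actual) * 100.0


# Hypothetical episode: three steps with reward 1 and gamma = 0.5.
episode_rewards = [1.0, 1.0, 1.0]
error_pct = relative_estimation_error(2.1, episode_rewards, gamma=0.5)
print(f"estimation error: {error_pct:.2f}%")
```

Under this assumed definition, a critic that overestimates the return produces a positive error, and a lower percentage means the critic's estimate is closer to what the policy actually earns.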
Abstract
Controlling a biped robot to walk stably is a challenging task considering its nonlinearity and hybrid dynamics. Reinforcement learning can address these issues by directly mapping the observed states to optimal actions that maximize the cumulative reward. However, the local minima caused by unsuitable rewards and the overestimation of the cumulative reward impede the maximization of the cumulative reward. To increase the cumulative reward, this paper designs a gait reward based on walking principles, which compensates the local minima for unnatural motions. Besides, an Adversarial Twin Delayed Deep Deterministic (ATD3) policy gradient algorithm with a recurrent neural network (RNN) is proposed to further boost the cumulative reward by mitigating the overestimation of the cumulative reward. Experimental results in the Roboschool Walker2d and Webots Atlas simulators indicate that the…
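For context on the overestimation the abstract refers to, the sketch below shows the target computation of the standard Twin Delayed Deep Deterministic (TD3) algorithm, on which ATD3 builds by name. It is not the paper's adversarial variant, which this summary does not describe. The function name and the example numbers are illustrative assumptions.

```python
def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Standard TD3 target: bootstrap from the smaller of two critic estimates.

    Taking the minimum of the twin critics counteracts the upward bias that
    arises when a single critic's noisy estimate is used for bootstrapping.
    """
    if done:
        # No future reward after a terminal step.
        return reward
    return reward + gamma * min(q1_next, q2_next)


# Hypothetical transition: the two critics disagree about the next state.
target = clipped_double_q_target(reward=1.0, done=False, q1_next=10.0, q2_next=8.0)
print(target)
```

Both critics are then regressed toward this shared target. The more pessimistic estimate (8.0 here) is the one used, which is the mechanism plain TD3 relies on to limit overestimation of the cumulative reward.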
Taxonomy
Topics: Robotic Locomotion and Control · Prosthetics and Rehabilitation Robotics · Reinforcement Learning in Robotics
Methods: Test
