# Learning agile and dynamic motor skills for legged robots

**Authors:** Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, and Marco Hutter

arXiv: 1901.08652 · 2019-01-28

## TL;DR

This paper presents a reinforcement learning approach that trains neural network policies in simulation and transfers them to a real legged robot, enabling agile, energy-efficient locomotion and recovery from falls in complex configurations.

## Contribution

The paper introduces a method for training a neural network policy in simulation and transferring it to a real legged robot, and demonstrates locomotion skills beyond those achieved with prior methods on the ANYmal quadruped.

## Key findings

- Achieved precise and energy-efficient following of high-level velocity commands.
- Enabled ANYmal to run faster than it had with prior methods.
- Demonstrated recovery from falls in complex configurations.

## Abstract

Legged robots pose one of the greatest challenges in robotics. Dynamic and agile maneuvers of animals cannot be imitated by existing methods that are crafted by humans. A compelling alternative is reinforcement learning, which requires minimal craftsmanship and promotes the natural evolution of a control policy. However, so far, reinforcement learning research for legged robots is mainly limited to simulation, and only few and comparably simple examples have been deployed on real systems. The primary reason is that training with real robots, particularly with dynamically balancing systems, is complicated and expensive. In the present work, we introduce a method for training a neural network policy in simulation and transferring it to a state-of-the-art legged system, thereby leveraging fast, automated, and cost-effective data generation schemes. The approach is applied to the ANYmal robot, a sophisticated medium-dog-sized quadrupedal system. Using policies trained in simulation, the quadrupedal machine achieves locomotion skills that go beyond what had been achieved with prior methods: ANYmal is capable of precisely and energy-efficiently following high-level body velocity commands, running faster than before, and recovering from falling even in complex configurations.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.08652/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1901.08652/full.md

## References

61 references — full list in the complete paper: https://tomesphere.com/paper/1901.08652/full.md

---
Source: https://tomesphere.com/paper/1901.08652