# Self-Imitation Learning of Locomotion Movements through Termination Curriculum

**Authors:** Amin Babadi, Kourosh Naderi, Perttu Hämäläinen

arXiv: 1907.11842 · 2019-09-24

## TL;DR

This paper combines synthetic reference motion, Reference State Initialization (RSI), and a Termination Curriculum (TC) to accelerate the learning of stable locomotion in neural network controllers, so that agents learn locomotion skills in a few hours on a modest 4-core computer.

## Contribution

The paper proposes Termination Curriculum (TC), a curriculum learning method that adapts the episode termination threshold over time, and combines it with Reference State Initialization (RSI) to imitate synthesized reference motions efficiently.

## Key findings

- Agents learn locomotion skills in a few hours on a modest 4-core computer.
- Synthetic, cyclic reference motions come from an online tree search that can discover stable walking gaits in a few minutes.
- The approach is demonstrated on a variety of characters.

## Abstract

Animation and machine learning research have shown great advancements in the past decade, leading to robust and powerful methods for learning complex physically-based animations. However, learning can take hours or days, especially if no reference movement data is available. In this paper, we propose and evaluate a novel combination of techniques for accelerating the learning of stable locomotion movements through self-imitation learning of synthetic animations. First, we produce synthetic and cyclic reference movement using a recent online tree search approach that can discover stable walking gaits in a few minutes. This allows us to use reinforcement learning with Reference State Initialization (RSI) to find a neural network controller for imitating the synthesized reference motion. We further accelerate the learning using a novel curriculum learning approach called Termination Curriculum (TC), that adapts the episode termination threshold over time. The combination of the RSI and TC ensures that simulation budget is not wasted in regions of the state space not visited by the final policy. As a result, our agents can learn locomotion skills in just a few hours on a modest 4-core computer. We demonstrate this by producing locomotion movements for a variety of characters.
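The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the linear schedule, the direction in which the threshold moves, the deviation measure, and every name and default value (`termination_threshold`, `sample_initial_state`, `should_terminate`, `start`, `end`) are assumptions made for illustration, since this summary does not give those details.

```python
import random


def termination_threshold(iteration, total_iterations, start=2.0, end=0.5):
    """Allowed deviation from the reference at a given training iteration.

    Assumption: a linear schedule from `start` to `end`; the paper's actual
    schedule is not stated in this summary. `total_iterations` must be > 0.
    """
    frac = min(max(iteration / total_iterations, 0.0), 1.0)
    return start + frac * (end - start)


def sample_initial_state(reference_motion, rng):
    """RSI: begin an episode at a randomly chosen frame of the reference motion."""
    phase = rng.randrange(len(reference_motion))
    return phase, reference_motion[phase]


def should_terminate(state, reference_state, threshold):
    """End the episode once any state dimension drifts past the threshold.

    Assumption: deviation is the largest absolute per-dimension difference.
    """
    deviation = max(abs(s - r) for s, r in zip(state, reference_state))
    return deviation > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy cyclic "reference motion": four frames of a 2-D state.
    reference = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    phase, state = sample_initial_state(reference, rng)
    for iteration in (0, 50, 100):
        limit = termination_threshold(iteration, 100)
        print(iteration, limit, should_terminate([0.0, 0.0], [0.0, 1.0], limit))
```

In this sketch an episode that starts from a reference frame is cut short as soon as it strays beyond the current limit, which is one way to avoid spending simulation budget on states the final policy would not visit.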

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.11842/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/1907.11842/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1907.11842/full.md

---
Source: https://tomesphere.com/paper/1907.11842