# Sim-to-Real Transfer for Biped Locomotion

**Authors:** Wenhao Yu, Visak CV Kumar, Greg Turk, C. Karen Liu

arXiv: 1903.01390 · 2019-08-27

## TL;DR

This paper introduces a sim-to-real transfer method that performs system identification in two stages, before and after policy learning, and uses it to deploy three biped locomotion policies trained in simulation on the Darwin OP2 robot.

## Contribution

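It proposes combining two rounds of system identification with a Projected Universal Policy (PUP). Pre-sysID approximates the range of the hardware model parameters μ before policy learning, the PUP is conditioned on a low-dimensional latent variable η projected from μ, and post-sysID searches for the best η on the real robot.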
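To make the structure of a policy conditioned on a projected latent variable concrete, here is a minimal NumPy sketch of the forward pass only. It is an illustration under stated assumptions, not the authors' implementation: the layer sizes, the single hidden layer, the `tanh` activations, and the names `project`, `policy`, and `pup` are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions chosen for illustration only.
MU_DIM, ETA_DIM, OBS_DIM, ACT_DIM, HIDDEN = 6, 2, 10, 4, 16

def init_layer(n_in, n_out):
    """Small random weights and zero biases (untrained placeholder)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

# Projection network: model parameters mu -> latent eta.
W_proj, b_proj = init_layer(MU_DIM, ETA_DIM)
# Policy network: [observation, eta] -> action.
W1, b1 = init_layer(OBS_DIM + ETA_DIM, HIDDEN)
W2, b2 = init_layer(HIDDEN, ACT_DIM)

def project(mu):
    """Map mu to a low-dimensional eta; tanh bounding is an assumption."""
    return np.tanh(mu @ W_proj + b_proj)

def policy(obs, eta):
    """Compute an action from the observation, conditioned on eta."""
    h = np.tanh(np.concatenate([obs, eta]) @ W1 + b1)
    return np.tanh(h @ W2 + b2)

def pup(obs, mu):
    """Simulation-time path: mu is known, so project it and then act."""
    return policy(obs, project(mu))

mu = rng.uniform(0.0, 1.0, size=MU_DIM)   # sampled simulator parameters
obs = rng.normal(size=OBS_DIM)            # placeholder observation
eta = project(mu)
action = pup(obs, mu)
```

In this sketch, `policy` can be called with any `eta` directly, without knowing `mu`. That is what makes it possible to search over η on hardware, where the true μ is unknown.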

## Key findings

- Successfully transferred three biped locomotion policies to real hardware
- Used Bayesian Optimization on the real robot to select the latent variable η that maximizes the performance of the trained policy
- Achieved forward, backward, and sideways walking on the Darwin OP2 robot
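The Bayesian Optimization step can be sketched as a small search loop over a low-dimensional η. The example below is a toy sketch under stated assumptions, not the paper's procedure: `rollout_return` is a hypothetical stand-in for running the policy on hardware, η is assumed to be two-dimensional and bounded in [-1, 1], and the Gaussian-process surrogate with an upper-confidence-bound acquisition is one common choice that the summary above does not confirm.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq / length_scale ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    var = 1.0 - np.sum(k_qo * np.linalg.solve(k_oo, k_qo.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def rollout_return(eta):
    """Hypothetical stand-in for one hardware rollout with latent eta."""
    target = np.array([0.4, -0.2])  # made-up optimum for this toy objective
    return float(np.exp(-8.0 * np.sum((eta - target) ** 2)))

def bayes_opt(objective, candidates, n_init=4, n_iter=20, kappa=2.0, seed=0):
    """Maximize objective over a finite candidate set with GP-UCB."""
    rng = np.random.default_rng(seed)
    idx = [int(i) for i in rng.choice(len(candidates), size=n_init, replace=False)]
    ys = [objective(candidates[i]) for i in idx]
    for _ in range(n_iter):
        y = np.array(ys)
        # Center the returns so the zero-mean GP prior is reasonable.
        mean, std = gp_posterior(candidates[idx], y - y.mean(), candidates)
        ucb = mean + kappa * std
        ucb[idx] = -np.inf  # never re-evaluate an already tried eta
        nxt = int(np.argmax(ucb))
        idx.append(nxt)
        ys.append(objective(candidates[nxt]))
    best = int(np.argmax(ys))
    return candidates[idx[best]], ys[best]

# Candidate etas on a 21 x 21 grid over [-1, 1]^2 (assumed bounds).
axis = np.linspace(-1.0, 1.0, 21)
grid = np.array([[a, b] for a in axis for b in axis])
best_eta, best_return = bayes_opt(rollout_return, grid)
```

Each call to `objective` corresponds to one trial on the robot, so the total budget here is `n_init + n_iter` trials. Keeping η low-dimensional is what keeps such a search affordable on hardware.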

## Abstract

We present a new approach for transfer of dynamic robot control policies such as biped locomotion from simulation to real hardware. Key to our approach is to perform system identification of the model parameters μ of the hardware (e.g. friction, center-of-mass) in two distinct stages, before policy learning (pre-sysID) and after policy learning (post-sysID). Pre-sysID begins by collecting trajectories from the physical hardware based on a set of generic motion sequences. Because the trajectories may not be related to the task of interest, pre-sysID does not attempt to accurately identify the true value of μ, but only to approximate the range of μ to guide the policy learning. Next, a Projected Universal Policy (PUP) is created by simultaneously training a network that projects μ to a low-dimensional latent variable η and a family of policies that are conditioned on η. The second round of system identification (post-sysID) is then carried out by deploying the PUP on the robot hardware using task-relevant trajectories. We use Bayesian Optimization to determine the values for η that optimize the performance of PUP on the real hardware. We have used this approach to create three successful biped locomotion controllers (walk forward, walk backwards, walk sideways) on the Darwin OP2 robot.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.01390/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1903.01390/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1903.01390/full.md

---
Source: https://tomesphere.com/paper/1903.01390