Policy Transfer with Strategy Optimization

Wenhao Yu; C. Karen Liu; Greg Turk

arXiv:1810.05751·cs.LG·December 5, 2018·47 cites

TL;DR

This paper introduces a novel transfer learning method for robotic control policies that learns a family of policies in simulation and searches for the best one in the real environment, bypassing the need for precise system identification.

Contribution

It proposes a strategy optimization approach that leverages domain randomization to transfer control policies to unknown environments without explicit system identification.

Findings

01

Outperforms robust and adaptive policies when modeling errors are large

02

Effective across five different simulated robotic tasks

03

Enables transfer to environments with significant discrepancies

Abstract

Computer simulation provides an automatic and safe way for training robotic control policies to achieve complex tasks such as locomotion. However, a policy trained in simulation usually does not transfer directly to the real hardware due to the differences between the two environments. Transfer learning using domain randomization is a promising approach, but it usually assumes that the target environment is close to the distribution of the training environments, thus relying heavily on accurate system identification. In this paper, we present a different approach that leverages domain randomization for transferring control policies to unknown environments. The key idea is that, instead of learning a single policy in the simulation, we simultaneously learn a family of policies that exhibit different behaviors. When tested in the target environment, we directly search for the best policy in…
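The idea of searching over a family of policies directly in the target environment can be illustrated with a small sketch. Everything below is a hypothetical toy and not the authors' implementation: the "policy family" is a proportional controller indexed by a single scalar strategy, the target dynamics (`hidden_friction`) are invented for illustration, and the search is a simple cross-entropy-style loop chosen here as an assumption, since the abstract excerpt does not say which optimizer the paper uses.

```python
import random
import statistics


def rollout_return(strategy, hidden_friction=0.7, steps=50):
    """Toy stand-in for one rollout in the target environment (hypothetical).

    The strategy selects one member of the policy family; hidden_friction
    plays the role of a dynamics parameter the searcher never observes.
    """
    x, total = 1.0, 0.0
    for _ in range(steps):
        action = -strategy * x            # policy pi(x; strategy)
        x = x + hidden_friction * action  # target dynamics, unknown to the search
        total -= x * x                    # reward: keep the state near zero
    return total


def search_strategy(evaluate, low=0.0, high=3.0, iters=15, pop=20, elite=5, seed=0):
    """Cross-entropy-style search over the strategy, using only rollout returns."""
    rng = random.Random(seed)
    mean, std = (low + high) / 2, (high - low) / 2
    best_s, best_r = mean, evaluate(mean)
    for _ in range(iters):
        # Sample candidate strategies and clip them to the allowed range.
        samples = [min(high, max(low, rng.gauss(mean, std))) for _ in range(pop)]
        scored = sorted(((evaluate(s), s) for s in samples), reverse=True)
        elites = [s for _, s in scored[:elite]]
        # Refit the sampling distribution to the best-performing candidates.
        mean = statistics.fmean(elites)
        std = statistics.pstdev(elites) + 1e-3
        if scored[0][0] > best_r:
            best_r, best_s = scored[0]
    return best_s, best_r


best_s, best_r = search_strategy(rollout_return)
print(f"best strategy: {best_s:.3f}, return: {best_r:.5f}")
```

The point of the sketch is that no system identification takes place: the search never estimates `hidden_friction`, it only compares returns of candidate strategies evaluated in the target environment.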

Code & Models

Repositories

vincentyu68/policy_transfer

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Fuel Cells and Related Materials