Policy Transfer with Strategy Optimization
Wenhao Yu, C. Karen Liu, Greg Turk

TL;DR
This paper introduces a novel transfer learning method for robotic control policies that learns a family of policies in simulation and searches for the best one in the real environment, bypassing the need for precise system identification.
Contribution
It proposes a strategy optimization approach that leverages domain randomization to transfer control policies to unknown environments without explicit system identification.
Findings
Outperforms robust and adaptive policies when modeling errors are large
Effective across five different simulated robotic tasks
Enables transfer to environments with significant discrepancies
Abstract
Computer simulation provides an automatic and safe way for training robotic control policies to achieve complex tasks such as locomotion. However, a policy trained in simulation usually does not transfer directly to the real hardware due to the differences between the two environments. Transfer learning using domain randomization is a promising approach, but it usually assumes that the target environment is close to the distribution of the training environments, thus relying heavily on accurate system identification. In this paper, we present a different approach that leverages domain randomization for transferring control policies to unknown environments. The key idea is that, instead of learning a single policy in the simulation, we simultaneously learn a family of policies that exhibit different behaviors. When tested in the target environment, we directly search for the best policy in…
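The search step described in the abstract can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract is truncated before it names the search space, so the sketch assumes the policy family is indexed by a low-dimensional strategy vector `z`, and it substitutes a simple greedy Gaussian random search for whatever optimizer the paper uses. The functions `family_policy`, `rollout_return`, and `search_strategy`, the point-mass environment, and the target mass of 3.0 are all hypothetical stand-ins.

```python
import numpy as np

def family_policy(state, z):
    """Hypothetical policy family: a linear feedback law whose gains are the strategy z."""
    pos, vel = state
    return -z[0] * pos - z[1] * vel

def rollout_return(z, mass, steps=200, dt=0.05):
    """Toy 'target environment' (an assumption): drive a point mass from pos=1 to the origin.

    The mass is unknown to the policy; only the scalar return is observed.
    """
    pos, vel, total = 1.0, 0.0, 0.0
    for _ in range(steps):
        force = family_policy((pos, vel), z)
        force = max(-5.0, min(5.0, force))  # actuator limit
        vel += dt * force / mass
        pos += dt * vel
        total -= dt * (pos ** 2 + 0.01 * force ** 2)  # negative cost as return
    return total

def search_strategy(evaluate, dim, iters=30, pop=8, sigma=0.5, seed=0):
    """Greedy Gaussian random search over z (a stand-in optimizer, not the paper's)."""
    rng = np.random.default_rng(seed)
    best_z = np.zeros(dim)
    best_ret = evaluate(best_z)
    for _ in range(iters):
        # Sample candidate strategies around the current best and test each in the target.
        candidates = best_z + sigma * rng.standard_normal((pop, dim))
        returns = [evaluate(z) for z in candidates]
        i = int(np.argmax(returns))
        if returns[i] > best_ret:
            best_z, best_ret = candidates[i], returns[i]
    return best_z, best_ret

# Search directly in the target environment; no system identification of the mass is done.
target = lambda z: rollout_return(z, mass=3.0)
baseline = target(np.zeros(2))
best_z, best_ret = search_strategy(target, dim=2)
print("baseline return:", baseline)
print("best strategy:", best_z, "return:", best_ret)
```

The point of the sketch is that the target environment is treated as a black box: each candidate `z` costs one rollout, and nothing about the dynamics parameters is estimated explicitly.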
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Fuel Cells and Related Materials
