Factored Contextual Policy Search with Bayesian Optimization
Peter Karkus, Andras Kupcsik, David Hsu, Wee Sun Lee

TL;DR
This paper introduces a factored approach to contextual policy search with Bayesian optimization. It separates target-type contexts from environment-type contexts to improve data efficiency, which leads to faster policy generalization in robot learning.
Contribution
It proposes a factored contextual policy search (CPS) method that exploits the structure of contexts to improve data efficiency and generalization in policy learning.
Findings
Faster policy generalization in simulated experiments
Effective separation of context components improves learning efficiency
Applicable to both passive and active learning settings
Abstract
Scarce data is a major challenge to scaling robot learning to truly complex tasks, as we need to generalize locally learned policies over different "contexts". Bayesian optimization approaches to contextual policy search (CPS) offer data-efficient policy learning that generalizes over a context space. We propose to improve data-efficiency by factoring typically considered contexts into two components: target-type contexts that correspond to a desired outcome of the learned behavior, e.g. target position for throwing a ball; and environment-type contexts that correspond to some state of the environment, e.g. initial ball position or wind speed. Our key observation is that experience can be directly generalized over target-type contexts. Based on that, we introduce Factored Contextual Policy Search with Bayesian Optimization for both passive and active learning settings. Preliminary results…
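The key observation in the abstract, that experience generalizes directly over target-type contexts, can be illustrated with a small sketch. It is not the authors' implementation. It assumes each rollout records an observed outcome (for example, where the ball landed) and that the reward is a known function of that outcome and the target. The names `reward` and `relabel`, the squared-distance reward, and the toy data are all hypothetical.

```python
import numpy as np

def reward(outcome, target):
    """Hypothetical reward: negative squared distance from outcome to target."""
    diff = np.asarray(outcome) - np.asarray(target)
    return -float(np.sum(diff ** 2))

def relabel(rollouts, new_target):
    """Re-evaluate stored rollouts for a new target without new robot trials.

    Each rollout is (environment_context, policy_parameters, outcome).
    The environment context and parameters are kept; only the reward changes.
    """
    return [(env, theta, reward(outcome, new_target))
            for env, theta, outcome in rollouts]

# Toy data (assumed): initial ball position, throw parameters, landing position.
rollouts = [
    (np.array([0.0]), np.array([1.0, 0.5]), np.array([2.0])),
    (np.array([0.1]), np.array([1.2, 0.4]), np.array([3.0])),
]

# Score the same two throws against a target that was never explicitly tried.
relabelled = relabel(rollouts, new_target=np.array([3.0]))
best_env, best_theta, best_reward = max(relabelled, key=lambda item: item[2])
```

Under these assumptions, every stored throw yields a reward for any target, so a surrogate model used for Bayesian optimization would only need to generalize over environment-type contexts and policy parameters.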
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
