Active Contextual Entropy Search

Jan Hendrik Metzen

arXiv:1511.04211 · stat.ML · November 17, 2015 · 5 cites

TL;DR

This paper extends entropy search, a variant of Bayesian optimization, to active contextual policy search in robotics: during training, the agent itself selects the tasks in which it expects to learn the most.

Contribution

It extends entropy search to active contextual policy search, so that the agent chooses the training tasks in which it expects to learn the most.

Findings

01

Empirical results in simulation suggest that fewer trials are needed to learn successful behavior.

02

The method outperforms non-active approaches in simulation.

03

Active task selection improves sample efficiency in robotic policy search.

Abstract

Contextual policy search allows adapting robotic movement primitives to different situations. For instance, a locomotion primitive might be adapted to different terrain inclinations or desired walking speeds. Such an adaptation is often achievable by modifying a small number of hyperparameters. However, learning, when performed on real robotic systems, is typically restricted to a small number of trials. Bayesian optimization has recently been proposed as a sample-efficient means for contextual policy search that is well suited to these conditions. In this work, we extend entropy search, a variant of Bayesian optimization, such that it can be used for active contextual policy search, where the agent selects those tasks during training in which it expects to learn the most. Empirical results in simulation suggest that this allows learning successful behavior with fewer trials.
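To make the idea of active task selection concrete, the following is a minimal sketch and not the paper's method. It assumes a one-dimensional context (the task) and a one-dimensional policy parameter, a hypothetical toy reward, and a Gaussian process surrogate over the joint context-parameter space. As a crude stand-in for entropy search's expected information gain, the agent simply queries the (context, parameter) pair with the largest posterior variance. All function names, the kernel length scale, and the noise level are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.3):
    """Squared-exponential kernel between row-stacked points A (n, d) and B (m, d)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)

def gp_posterior(X_train, y_train, X_query, noise=1e-4):
    """GP posterior mean and variance at X_query (zero prior mean, unit prior variance)."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query)
    solved = np.linalg.solve(K, K_s)              # K^{-1} K_s
    mean = solved.T @ y_train
    var = 1.0 - np.sum(K_s * solved, axis=0)
    return mean, np.clip(var, 0.0, None)

def toy_reward(context, param):
    """Hypothetical reward (assumption): the best parameter equals the context."""
    return -(param - context) ** 2

def select_query(X_train, y_train, candidates):
    """Pick the (context, parameter) candidate with the largest posterior variance.

    This is an uncertainty-sampling stand-in, not the entropy search acquisition.
    """
    _, var = gp_posterior(X_train, y_train, candidates)
    return candidates[np.argmax(var)]

# Candidate grid over contexts (tasks) and policy parameters, both in [0, 1].
grid = np.linspace(0.0, 1.0, 11)
candidates = np.array([[s, x] for s in grid for x in grid])

# Start from a single trial, then let the agent choose both task and parameter.
X_train = np.array([[0.5, 0.5]])
y_train = np.array([toy_reward(0.5, 0.5)])

for _ in range(10):
    s, x = select_query(X_train, y_train, candidates)
    X_train = np.vstack([X_train, [s, x]])
    y_train = np.append(y_train, toy_reward(s, x))
```

In a non-active setting the context `s` would be imposed by the environment and only the parameter `x` would be chosen; here both are selected by the agent, which is the distinction the abstract draws.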

Taxonomy

Topics: Human Pose and Action Recognition · Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research