Cautious Bayesian Optimization for Efficient and Scalable Policy Search
Lukas P. Fröhlich, Melanie N. Zeilinger, Edgar D. Klenske

TL;DR
This paper introduces a cautious Bayesian Optimization method that constrains the search space using the surrogate model's predictive uncertainty, enabling efficient policy search in high-dimensional spaces and reducing the risk of damaging the system.
Contribution
The paper proposes restricting the Bayesian Optimization search space to a sublevel set of the surrogate model's predictive uncertainty, which keeps policy updates local and improves both scalability and safety in policy search.
Findings
Effective in high-dimensional spaces (>100 dimensions)
Reduces risk of damaging the system during optimization
Demonstrates success on diverse tasks including motor skills and sim-to-real
Abstract
Sample efficiency is one of the key factors when applying policy search to real-world problems. In recent years, Bayesian Optimization (BO) has become prominent in the field of robotics due to its sample efficiency and little prior knowledge needed. However, one drawback of BO is its poor performance on high-dimensional search spaces as it focuses on global search. In the policy search setting, local optimization is typically sufficient as initial policies are often available, e.g., via meta-learning, kinesthetic demonstrations or sim-to-real approaches. In this paper, we propose to constrain the policy search space to a sublevel-set of the Bayesian surrogate model's predictive uncertainty. This simple yet effective way of constraining the policy update enables BO to scale to high-dimensional spaces (>100) as well as reduces the risk of damaging the system. We demonstrate the…
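The constraint described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold form `gamma * signal_std`, the upper-confidence-bound acquisition, and the use of a finite candidate set are all assumptions made for illustration. The sketch fits a small Gaussian process with NumPy, keeps only candidate policies whose predictive standard deviation lies below the threshold (the sublevel set), and picks the best acquisition value among them.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.5, signal_std=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_std**2 * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_posterior(X_train, y_train, X_query, noise_std=1e-2, signal_std=1.0):
    """Posterior mean and standard deviation of a zero-mean GP at X_query."""
    K = rbf_kernel(X_train, X_train, signal_std=signal_std)
    K += noise_std**2 * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query, signal_std=signal_std)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = signal_std**2 - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def cautious_step(X_train, y_train, candidates, gamma=0.5, beta=2.0, signal_std=1.0):
    """Pick the next policy parameters from the uncertainty sublevel set.

    gamma (hypothetical name) scales the uncertainty threshold; beta is the
    exploration weight of the assumed UCB acquisition. Returns None when no
    candidate satisfies the constraint.
    """
    mean, std = gp_posterior(X_train, y_train, candidates, signal_std=signal_std)
    feasible = std <= gamma * signal_std  # sublevel set of predictive uncertainty
    if not np.any(feasible):
        return None
    ucb = np.where(feasible, mean + beta * std, -np.inf)
    return candidates[np.argmax(ucb)]

# Usage: start from an initial policy (here the origin) with a few nearby evaluations.
rng = np.random.default_rng(0)
dim = 5
X_train = 0.1 * rng.standard_normal((5, dim))
y_train = -np.sum(X_train**2, axis=1)           # toy reward, maximal at the origin
candidates = 0.5 * rng.standard_normal((200, dim))
next_theta = cautious_step(X_train, y_train, candidates)
```

Because the predictive uncertainty is small only near previously evaluated policies, the feasible set stays local to the data, which matches the abstract's argument that local search suffices when an initial policy is available. A smaller `gamma` gives more conservative steps.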
Taxonomy
Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms
