Classified Regression for Bayesian Optimization: Robot Learning with Unknown Penalties
Alonso Marco, Dominik Baumann, Philipp Hennig, Sebastian Trimpe

TL;DR
This paper introduces a Bayesian optimization approach that models unknown penalties for unstable robot controllers, guiding exploration towards stable regions and improving learning efficiency in robotic tasks.
Contribution
It proposes a Bayesian model for unknown penalties, integrated with Gaussian processes, and extends Max-Value Entropy Search to handle unknown constraints in optimization.
Findings
Model effectively predicts high costs for unstable controllers.
Guides Bayesian optimization towards stable regions.
Validated on synthetic benchmarks and real robotic platform.
Abstract
Learning robot controllers by minimizing a black-box objective cost using Bayesian optimization (BO) can be time-consuming and challenging. It is very often the case that some roll-outs result in failure behaviors, causing premature experiment detention. In such cases, the designer is forced to decide on heuristic cost penalties because the acquired data is often scarce, or not comparable with that of the stable policies. To overcome this, we propose a Bayesian model that captures exactly what we know about the cost of unstable controllers prior to data collection: Nothing, except that it should be a somewhat large number. The resulting Bayesian model, approximated with a Gaussian process, predicts high cost values in regions where failures are likely to occur. In this way, the model guides the BO exploration toward regions of stability. We demonstrate the benefits of the proposed model…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGaussian Processes and Bayesian Inference · Machine Learning and Algorithms · Advanced Multi-Objective Optimization Algorithms
