Bayesian Optimization for Sample-Efficient Policy Improvement in Robotic Manipulation
Adrian Röfer, Iman Nematollahi, Tim Welschehold, Wolfram Burgard, Abhinav Valada

TL;DR
This paper presents BOpt-GMM, a hybrid method that combines imitation learning with Bayesian optimization to improve robotic manipulation skills from minimal real-world data, demonstrated in simulation and on real robots.
Contribution
Introduction of BOpt-GMM, a novel hybrid approach that enhances manipulation skills using few demonstrations and autonomous optimization, reducing data requirements.
Findings
Achieves high sample efficiency in complex manipulation tasks
Effective in both simulation and real-world experiments
Code and models are publicly available
Abstract
Sample efficient learning of manipulation skills poses a major challenge in robotics. While recent approaches demonstrate impressive advances in the type of task that can be addressed and the sensing modalities that can be incorporated, they still require large amounts of training data. Especially with regard to learning actions on robots in the real world, this poses a major problem due to the high costs associated with both demonstrations and real-world robot interactions. To address this challenge, we introduce BOpt-GMM, a hybrid approach that combines imitation learning with own experience collection. We first learn a skill model as a dynamical system encoded in a Gaussian Mixture Model from a few demonstrations. We then improve this model with Bayesian optimization building on a small number of autonomous skill executions in a sparse reward setting. We demonstrate the sample…
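The idea of a skill model encoded as a dynamical system in a Gaussian Mixture Model can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the mixture parameters, the Euler rollout, the goal position, and the function names (`gmr_velocity`, `rollout`, `sparse_reward`) are all assumptions made for illustration. The GMM models the joint distribution of position and velocity, and Gaussian mixture regression gives the expected velocity at a given position. The Bayesian optimization step, which would adjust the mixture parameters based on the reward, is not shown.

```python
import math

# Hypothetical 1-D GMM over the joint (position x, velocity v).
# Each component: (weight, mean_x, mean_v, var_x, cov_xv).
# The velocity variance is omitted because the conditional mean does not need it.
# All values are made up for illustration; in practice they would be fit to demonstrations.
COMPONENTS = [
    (0.5, 0.0, 0.8, 0.1, 0.0),   # far from the goal: move right at constant speed
    (0.5, 1.0, 0.0, 0.1, -0.1),  # near the goal: velocity decays linearly to zero
]


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmr_velocity(x, components=COMPONENTS):
    """Gaussian mixture regression: conditional mean E[v | x] under the GMM."""
    # Unnormalized responsibility of each component for position x.
    resp = [w * gaussian_pdf(x, mx, vx) for (w, mx, _, vx, _) in components]
    # Per-component conditional mean: mean_v + cov_xv / var_x * (x - mean_x).
    cond = [mv + cxv / vx * (x - mx) for (_, mx, mv, vx, cxv) in components]
    return sum(r * c for r, c in zip(resp, cond)) / sum(resp)


def rollout(x0, steps=200, dt=0.05):
    """Integrate the learned velocity field with explicit Euler steps."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * gmr_velocity(x)
        trajectory.append(x)
    return trajectory


def sparse_reward(trajectory, goal=1.0, tol=0.05):
    """Assumed binary reward: 1.0 only if the final position is within tol of the goal."""
    return 1.0 if abs(trajectory[-1] - goal) < tol else 0.0


if __name__ == "__main__":
    traj = rollout(0.0)
    print("final position:", traj[-1])
    print("reward:", sparse_reward(traj))
```

With these assumed parameters, a rollout started at x = 0 settles close to the goal at x = 1, so the sparse reward is 1.0. An optimizer in this setting would only observe that single scalar per execution, which is why each evaluation is costly and sample efficiency matters.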
Taxonomy
Topics: Machine Learning and Data Classification · Robot Manipulation and Learning · Machine Learning and Algorithms
