Efficient Model-Based Reinforcement Learning for Robot Control via Online Optimization
Fang Nan, Hao Ma, Qinghua Guan, Josie Hughes, Michael Muehlebach, Marco Hutter

TL;DR
This paper introduces an online model-based reinforcement learning algorithm that enables efficient, real-time control of complex robots directly in the real world, reducing sample complexity and improving adaptability.
Contribution
The method combines online dynamics modeling with policy optimization, providing formal performance guarantees and demonstrating strong sample efficiency in robotic experiments.
Findings
Achieves comparable control performance within hours on real robots.
Demonstrates robustness to changing payload conditions.
Reduces reliance on extensive offline simulation data.
Abstract
We present an online model-based reinforcement learning algorithm suitable for controlling complex robotic systems directly in the real world. Unlike prevailing sim-to-real pipelines that rely on extensive offline simulation and model-free policy optimization, our method builds a dynamics model from real-time interaction data and performs policy updates guided by the learned dynamics model. This efficient model-based reinforcement learning scheme significantly reduces the number of samples needed to train control policies, enabling direct training on real-world rollout data. Training on real-world data in turn significantly reduces the influence of bias in the simulated data and facilitates the search for high-performance control policies. We adopt online optimization analysis to derive sublinear regret bounds under stochastic online optimization assumptions, providing formal guarantees on performance improvement as more…
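To make the loop described in the abstract concrete (fit a dynamics model from interaction data, then update the policy through that model), here is a minimal sketch on a toy problem. It is not the authors' algorithm: the scalar linear plant (`A_TRUE`, `B_TRUE`), the recursive least-squares model update, the one-step quadratic cost, and every name and hyperparameter below are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical toy plant (not from the paper): x_next = A_TRUE*x + B_TRUE*u + noise.
# With A_TRUE > 1 the uncontrolled system is unstable.
A_TRUE, B_TRUE = 1.2, 0.5


def run_online_mbrl(steps=300, seed=0, lr=0.05, effort_weight=0.1):
    """Interleave online model fitting and model-guided policy updates."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)      # learned model parameters [a_hat, b_hat]
    P = 100.0 * np.eye(2)    # recursive least-squares covariance
    k = 0.0                  # linear policy gain, u = -k * x
    x = 1.0
    for _ in range(steps):
        # Act on the "real" system, with a little exploration noise.
        u = -k * x + 0.1 * rng.standard_normal()
        x_next = A_TRUE * x + B_TRUE * u + 0.01 * rng.standard_normal()

        # Online dynamics update: one recursive least-squares step.
        phi = np.array([x, u])
        gain = P @ phi / (1.0 + phi @ P @ phi)
        theta = theta + gain * (x_next - phi @ theta)
        P = P - np.outer(gain, phi @ P)

        # Policy update guided by the learned model: one gradient step on the
        # predicted one-step cost (a_hat - b_hat*k)**2 + effort_weight*k**2.
        a_hat, b_hat = theta
        grad = -2.0 * b_hat * (a_hat - b_hat * k) + 2.0 * effort_weight * k
        k -= lr * grad

        x = x_next
    return theta[0], theta[1], k


if __name__ == "__main__":
    a_hat, b_hat, k = run_online_mbrl()
    print("learned model:", a_hat, b_hat)
    print("closed-loop pole:", A_TRUE - B_TRUE * k)
```

The policy never sees `A_TRUE` or `B_TRUE` directly; it improves only through the learned estimates, which is the sense in which the updates are model-guided. A real robot would need a far richer model class and policy, so this sketch shows only the structure of the loop.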
