What Matters for Simulation to Online Reinforcement Learning on Real Robots
Yarden As, Dhruva Tirumala, René Zurbrügg, Chenhao Li, Stelian Coros, Andreas Krause, Markus Wulfmeier

TL;DR
This paper systematically studies the design choices that enable successful online reinforcement learning on real robots, providing empirical insights to improve deployment stability and reduce engineering effort.
Contribution
It offers the first large-sample empirical analysis of algorithmic and system design choices for online RL on physical robots, identifying effective practices.
Findings
Some common defaults can be harmful
Robust design choices lead to stable learning
Empirical results across multiple robots and tasks
Abstract
We investigate what specific design choices enable successful online reinforcement learning (RL) on physical robots. Across 100 real-world training runs on three distinct robotic platforms, we systematically ablate algorithmic, systems, and experimental decisions that are typically left implicit in prior work. We find that some widely used defaults can be harmful, while a set of robust, readily adopted design choices within standard RL practice yields stable learning across tasks and hardware. These results provide the first large-sample empirical study of such design choices, enabling practitioners to deploy online RL with lower engineering effort.
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Advanced Bandit Algorithms Research
