Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control
Kendall Lowrey, Aravind Rajeswaran, Sham Kakade, Emanuel Todorov, Igor Mordatch

TL;DR
The paper introduces POLO, a framework combining local model-based control and global value learning, enabling efficient, stable, and exploratory learning for complex control tasks with minimal real-world experience.
Contribution
It presents a novel integration of trajectory optimization, value function learning, and exploration strategies to improve sample efficiency and stability in continuous control tasks.
Findings
Solves complex control tasks with only minutes of real-world experience
Local trajectory optimization stabilizes and accelerates value function learning
Trajectory optimization, combined with uncertainty estimates for the value function, drives temporally coordinated exploration
Abstract
We propose a plan online and learn offline (POLO) framework for the setting where an agent, with an internal model, needs to continually act and learn in the world. Our work builds on the synergistic relationship between local model-based control, global value function learning, and exploration. We study how local trajectory optimization can cope with approximation errors in the value function, and can stabilize and accelerate value function learning. Conversely, we also study how approximate value functions can help reduce the planning horizon and allow for better policies beyond local solutions. Finally, we also demonstrate how trajectory optimization can be used to perform temporally coordinated exploration in conjunction with estimating uncertainty in value function approximation. This exploration is critical for fast and stable learning of the value function. Combining these…
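The interplay described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a random-shooting planner, an ensemble of value functions as the uncertainty estimate, and a log-sum-exp (soft maximum) over the ensemble as the optimistic terminal value. The names `optimistic_value`, `plan_first_action`, `toy_dynamics`, `toy_reward`, and the parameter `kappa` are hypothetical and chosen only for illustration.

```python
import numpy as np


def optimistic_value(ensemble, state, kappa=1.0):
    """Soft maximum (log-sum-exp) of the ensemble's value estimates.

    Assumption: disagreement between ensemble members stands in for
    value-function uncertainty, so states where members disagree score
    above the ensemble mean and attract the planner.
    """
    values = np.array([v(state) for v in ensemble], dtype=float)
    m = values.max()
    # Subtracting the max keeps the exponentials numerically stable.
    return m + np.log(np.mean(np.exp(kappa * (values - m)))) / kappa


def plan_first_action(state, dynamics, reward, ensemble, horizon=5,
                      n_samples=256, gamma=0.99, rng=None):
    """Random-shooting planner with a learned terminal value.

    Samples action sequences in [-1, 1], rolls each through the model for
    `horizon` steps, adds the discounted optimistic value of the final
    state, and returns the first action of the best sequence.
    """
    rng = np.random.default_rng() if rng is None else rng
    best_return, best_action = -np.inf, None
    for _ in range(n_samples):
        actions = rng.uniform(-1.0, 1.0, size=horizon)
        s, total = state, 0.0
        for t, a in enumerate(actions):
            total += gamma ** t * reward(s, a)
            s = dynamics(s, a)
        # The terminal value replaces the rewards beyond the short horizon.
        total += gamma ** horizon * optimistic_value(ensemble, s)
        if total > best_return:
            best_return, best_action = total, actions[0]
    return best_action


# Toy 1-D problem (hypothetical): drive the state toward zero.
def toy_dynamics(s, a):
    return s + 0.1 * a


def toy_reward(s, a):
    return -s ** 2


# Two slightly different value estimates form the ensemble.
toy_ensemble = [lambda s: -10.0 * s ** 2, lambda s: -12.0 * s ** 2]

action = plan_first_action(1.0, toy_dynamics, toy_reward, toy_ensemble,
                           rng=np.random.default_rng(0))
```

In this sketch the terminal value lets a five-step lookahead account for rewards beyond the horizon, and the soft maximum biases planning toward states where the ensemble disagrees. Fitting the ensemble offline on the visited states, the "learn offline" half of the framework, is omitted here.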
Taxonomy
Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks · Simulation Techniques and Applications
