Evaluating model-based planning and planner amortization for continuous control
Arunkumar Byravan, Leonard Hasenclever, Piotr Trochim, Mehdi Mirza, Alessandro Davide Ialongo, Yuval Tassa, Jost Tobias Springenberg, Abbas Abdolmaleki, Nicolas Heess, Josh Merel, Martin Riedmiller

TL;DR
This paper evaluates model predictive control (MPC) combined with learned models and learned policy proposals on challenging continuous control locomotion tasks. It reports improved performance and data efficiency in hard multi-task/multi-goal settings and shows that the planner can be distilled into a policy.
Contribution
It introduces a hybrid approach combining MPC with learned models and policies, and shows how to distill planning into policies without performance loss.
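The hybrid idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple random-shooting planner, and `policy_proposal`, `learned_model`, and `reward` are hypothetical toy stand-ins for the learned policy, the learned dynamics model, and the task reward. The planner rolls out noisy variations of the policy's actions through the model and executes the first action of the best rollout.

```python
import numpy as np

def policy_proposal(state):
    # Hypothetical stand-in for a learned policy: a weak pull toward the origin.
    return -0.5 * state

def learned_model(state, action):
    # Hypothetical stand-in for a learned dynamics model: a point that moves by `action`.
    return state + 0.1 * action

def reward(state, action):
    # Toy reward (an assumption): stay near the origin, with a small action penalty.
    return -np.sum(state ** 2) - 0.01 * np.sum(action ** 2)

def mpc_action(state, rng, horizon=10, num_samples=64, noise_std=0.3):
    """Random-shooting MPC that uses the policy as its sampling proposal.

    Returns the first action of the sampled trajectory with the highest
    return under the model.
    """
    best_return, best_action = -np.inf, None
    for i in range(num_samples):
        s = np.array(state, dtype=float)
        total, first = 0.0, None
        for t in range(horizon):
            a = policy_proposal(s)
            if i > 0:
                # Sample 0 is the unperturbed policy rollout; the rest add noise.
                a = a + noise_std * rng.standard_normal(a.shape)
            if t == 0:
                first = a
            total += reward(s, a)
            s = learned_model(s, a)
        if total > best_return:
            best_return, best_action = total, first
    return best_action

rng = np.random.default_rng(0)
action = mpc_action(np.array([1.0, -1.0]), rng)
```

Because the first sample is the unperturbed policy rollout, the selected trajectory never scores worse under the model than following the proposal alone. In this sketch that is the sense in which planning can only improve on the policy, up to model error.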
Findings
MPC with learned proposals improves performance in multi-task settings
Well-tuned model-free agents are strong baselines for high DoF control
Planning can be distilled into policies to amortize computation
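The last finding, distilling planning into a policy, can be sketched as supervised regression on planner outputs. This is an illustration under stated assumptions, not the paper's procedure: `planner_action` is a hypothetical stand-in for an expensive planner call, and the student is a linear policy fit by least squares (plain behavioral cloning). Once fit, the student replaces the planner at run time, so the planning cost is paid only while collecting labels.

```python
import numpy as np

def planner_action(state):
    # Hypothetical stand-in for an expensive planner call (e.g. MPC).
    # A fixed linear map is used here so the example stays checkable.
    gain = np.array([[-0.8, 0.1],
                     [0.0, -0.6]])
    return gain @ state

def distill(states, actions):
    """Fit a linear student policy a = W s to planner-labelled data."""
    # lstsq solves states @ X ~= actions in the least-squares sense; W = X.T.
    solution, *_ = np.linalg.lstsq(states, actions, rcond=None)
    return solution.T

# Collect states, label each with the planner's action, then fit the student.
rng = np.random.default_rng(0)
states = rng.standard_normal((256, 2))
actions = np.stack([planner_action(s) for s in states])
W = distill(states, actions)

def student_policy(state):
    # Cheap amortized policy: one matrix-vector product, no planning.
    return W @ state
```

In a realistic setting the student would be a neural network and the planner's outputs would not be exactly representable, so some approximation error would be expected. The toy planner here is linear purely so that the fit is exact.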
Abstract
There is a widespread intuition that model-based control methods should be able to surpass the data efficiency of model-free approaches. In this paper we attempt to evaluate this intuition on various challenging locomotion tasks. We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning; the learned policy serves as a proposal for MPC. We find that well-tuned model-free agents are strong baselines even for high DoF control problems but MPC with learned proposals and models (trained on the fly or transferred from related tasks) can significantly improve performance and data efficiency in hard multi-task/multi-goal settings. Finally, we show that it is possible to distil a model-based planner into a policy that amortizes the planning computation without any loss of performance. Videos of agents performing different tasks can be…
Taxonomy
Topics: AI-based Problem Solving and Planning · Robotic Path Planning Algorithms · Reinforcement Learning in Robotics
