The Surprising Difficulty of Search in Model-Based Reinforcement Learning
Wei-Di Chang, Mikael Henaff, Brandon Amos, Gregory Dudek, Scott Fujimoto

TL;DR
In model-based reinforcement learning, search can hurt performance even when the learned model is highly accurate. The paper argues that mitigating distribution shift matters more than improving model or value function accuracy.
Contribution
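The paper challenges the conventional view that long-term predictions and compounding errors are the primary obstacles for model-based RL. It shows that search is not a plug-and-play replacement for a learned policy, and it identifies techniques that mitigate distribution shift so that search becomes effective.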
Findings
Search can harm performance even with accurate models
Mitigating distribution shift matters more than improving model or value function accuracy
The identified techniques achieve state-of-the-art performance across multiple popular benchmark domains
Abstract
This paper investigates search in model-based reinforcement learning (RL). Conventional wisdom holds that long-term predictions and compounding errors are the primary obstacles for model-based RL. We challenge this view, showing that search is not a plug-and-play replacement for a learned policy. Surprisingly, we find that search can harm performance even when the model is highly accurate. Instead, we show that mitigating distribution shift matters more than improving model or value function accuracy. Building on this insight, we identify key techniques for enabling effective search, achieving state-of-the-art performance across multiple popular benchmark domains.
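To make "search as a replacement for a learned policy" concrete, here is a minimal sketch of one common form of search in model-based RL: random-shooting planning that rolls candidate action sequences through a model and bootstraps with a value estimate at the horizon. This is an illustration only, not the paper's method. The toy 1-D environment and the names model, value, policy, and search are all hypothetical and do not come from the paper.

```python
import random


def model(state, action):
    """Hypothetical one-step model (assumed exact here): returns (next_state, reward)."""
    next_state = state + action
    reward = -abs(next_state)  # reward is highest at the origin
    return next_state, reward


def value(state):
    """Hypothetical value estimate used to bootstrap beyond the search horizon."""
    return -abs(state)


def policy(state):
    """Hypothetical learned policy: step toward the origin, clipped to [-1, 1]."""
    return max(-1.0, min(1.0, -state))


def search(state, horizon=3, num_candidates=64, noise=0.5, gamma=0.99, rng=None):
    """Random-shooting search: score perturbed policy rollouts, return the best first action."""
    rng = rng or random.Random(0)
    best_return, best_action = float("-inf"), policy(state)
    for i in range(num_candidates):
        s, total, discount, first = state, 0.0, 1.0, None
        for _ in range(horizon):
            a = policy(s)
            if i > 0:  # candidate 0 is the unperturbed policy rollout
                a = max(-1.0, min(1.0, a + rng.gauss(0.0, noise)))
            if first is None:
                first = a
            s, r = model(s, a)
            total += discount * r
            discount *= gamma
        total += discount * value(s)  # terminal value bootstrap
        if total > best_return:
            best_return, best_action = total, first
    return best_action


if __name__ == "__main__":
    print(search(2.0), policy(2.0))
```

In this toy the model and value estimate are exact and the policy is already optimal, so search simply returns the policy's action. The sketch shows only the mechanics; it does not reproduce the abstract's finding that search can harm performance even with an accurate model.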
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Generative Adversarial Networks and Image Synthesis
