The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces
G. Zacharias Holland, Erin J. Talvitie, and Michael Bowling

TL;DR
This paper investigates how the "shape" of planning (how many model rollouts are performed and how long each one is) affects Dyna-style reinforcement learning in high-dimensional environments. It finds that longer rollouts improve learning and shows that learned models can be used successfully in complex games.
Contribution
The paper shows that planning shape significantly influences Dyna's effectiveness, and it provides the first evidence of learned models being used successfully for planning in a high-dimensional environment such as the Arcade Learning Environment (ALE).
Findings
Fewer, longer rollouts improve Dyna's performance.
Planning shape impacts the benefit of model-based updates.
Learned models can be effectively used for planning in ALE.
Abstract
Dyna is a fundamental approach to model-based reinforcement learning (MBRL) that interleaves planning, acting, and learning in an online setting. In the most typical application of Dyna, the dynamics model is used to generate one-step transitions from selected start states from the agent's history, which are used to update the agent's value function or policy as if they were real experiences. In this work, one-step Dyna was applied to several games from the Arcade Learning Environment (ALE). We found that the model-based updates offered surprisingly little benefit over simply performing more updates with the agent's existing experience, even when using a perfect model. We hypothesize that to get the most from planning, the model must be used to generate unfamiliar experience. To test this, we experimented with the "shape" of planning in multiple different concrete instantiations of…
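The planning step the abstract describes can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes tabular Q-learning and a hypothetical toy chain environment (`chain_model`) rather than a deep RL agent on ALE, and every name and hyperparameter in it is illustrative. It shows only how a planning budget can be spent either as many one-step rollouts or as fewer, longer ones, each starting from a state in the agent's history.

```python
import random
from collections import defaultdict


def chain_model(state, action, goal=10):
    """Hypothetical perfect model of a toy chain: reward 1 on reaching `goal`."""
    next_state = min(max(state + action, 0), goal)
    done = next_state == goal
    return (1.0 if done else 0.0), next_state, done


def dyna_plan(q, model, start_states, actions, n_rollouts, rollout_len,
              alpha=0.1, gamma=0.95, epsilon=0.1, rng=random):
    """Run model-based Q-learning updates with a given planning shape.

    The shape is (n_rollouts, rollout_len); rollout_len=1 corresponds to
    one-step Dyna. Returns the number of updates actually applied.
    """
    updates = 0
    for _ in range(n_rollouts):
        state = rng.choice(start_states)  # start from a previously visited state
        for _ in range(rollout_len):
            # Epsilon-greedy action selection under the current value estimates.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            reward, next_state, done = model(state, action)
            # Treat the simulated transition as if it were real experience.
            bootstrap = 0.0 if done else gamma * max(q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (reward + bootstrap - q[(state, action)])
            updates += 1
            if done:
                break
            state = next_state  # continue the rollout from the simulated state
    return updates


# Compare three shapes with the same nominal budget of 100 updates.
history = [0, 1, 2]  # states the agent is assumed to have visited
for n_rollouts, rollout_len in [(100, 1), (10, 10), (1, 100)]:
    q = defaultdict(float)
    used = dyna_plan(q, chain_model, history, [1, -1], n_rollouts, rollout_len)
    farthest = max(state for (state, _action) in q)
    print(f"{n_rollouts}x{rollout_len}: {used} updates, farthest state touched = {farthest}")
```

In this toy setting, one-step rollouts can never leave the neighbourhood of the states in `history`, whereas longer rollouts may carry the simulated agent into states it has not visited. That is the kind of "unfamiliar experience" the abstract hypothesizes planning needs to generate, though the sketch makes no claim about how the effect plays out in ALE.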
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Evolutionary Algorithms and Applications
