The detour problem in a stochastic environment: Tolman revisited
Pegah Fakhari, Arash Khodadadi, Jerome Busemeyer

TL;DR
This study investigates human planning and re-planning in a stochastic grid world, showing that people can learn environment dynamics and adapt their plans when paths are blocked, with model-based reinforcement learning best explaining behavior.
Contribution
It introduces a novel grid world task to study human re-planning in stochastic environments and compares multiple models, highlighting the effectiveness of model-based reinforcement learning.
Findings
Participants learned to plan optimally in the environment.
People successfully revised plans when paths were blocked.
Model-based reinforcement learning best explained re-planning behavior.
Abstract
We designed a grid world task to study human planning and re-planning behavior in an unknown stochastic environment. In our grid world, participants were asked to travel from a random starting point to a random goal position while maximizing their reward. Because they were not familiar with the environment, they needed to learn its characteristics from experience to plan optimally. Later in the task, we randomly blocked the optimal path to investigate whether and how people adjust their original plans to find a detour. To this end, we developed and compared 12 different models. These models differed in how they learned and represented the environment and in how they planned to reach the goal. The majority of our participants were able to plan optimally. We also showed that people were capable of revising their plans when an unexpected event occurred. The result from the model…
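To make the idea of model-based planning and re-planning concrete, the following is a minimal sketch and not one of the 12 models compared in the paper. It assumes the transition model is already known (participants had to learn it from experience) and runs value iteration on a small grid, once with a clear path and once after a cell is blocked. The grid size, slip probability, rewards, discount factor, and all function names are illustrative assumptions.

```python
# Hypothetical sketch: value iteration in a small stochastic grid world.
# All parameters (slip, rewards, gamma) are illustrative assumptions,
# not values taken from the paper.

ACTIONS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def step(state, move, rows, cols, blocked):
    """Deterministic effect of a move; hitting a wall or blocked cell means staying put."""
    r, c = state[0] + move[0], state[1] + move[1]
    if 0 <= r < rows and 0 <= c < cols and (r, c) not in blocked:
        return (r, c)
    return state


def plan(rows, cols, goal, blocked=frozenset(), slip=0.1,
         step_cost=-1.0, goal_reward=10.0, gamma=0.95, tol=1e-6):
    """Return state values and a greedy policy for reaching `goal`."""
    states = [(r, c) for r in range(rows) for c in range(cols)
              if (r, c) not in blocked]
    values = {s: 0.0 for s in states}  # the goal is terminal, so its value stays 0

    def q_value(s, action):
        # The intended move happens with probability 1 - slip;
        # otherwise one of the other three moves happens, chosen uniformly.
        total = 0.0
        for other, move in ACTIONS.items():
            prob = 1.0 - slip if other == action else slip / 3.0
            nxt = step(s, move, rows, cols, blocked)
            reward = goal_reward if nxt == goal else step_cost
            total += prob * (reward + gamma * values[nxt])
        return total

    while True:
        delta = 0.0
        for s in states:
            if s == goal:
                continue
            best = max(q_value(s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a))
              for s in states if s != goal}
    return values, policy


def greedy_path(policy, start, goal, rows, cols, blocked=frozenset(), max_steps=50):
    """Follow the policy's intended moves (ignoring slips) from start toward goal."""
    path, s = [start], start
    while s != goal and len(path) <= max_steps:
        s = step(s, ACTIONS[policy[s]], rows, cols, blocked)
        path.append(s)
    return path


if __name__ == "__main__":
    rows, cols, start, goal = 3, 3, (0, 0), (0, 2)
    _, policy = plan(rows, cols, goal)
    print("original plan:", greedy_path(policy, start, goal, rows, cols))

    blocked = frozenset({(0, 1)})  # block a cell on the direct route
    _, detour_policy = plan(rows, cols, goal, blocked=blocked)
    print("detour plan:  ", greedy_path(detour_policy, start, goal, rows, cols, blocked))
```

Because the planner recomputes values from the updated model, blocking a cell changes the plan immediately, without relearning action values by trial and error. That immediate adjustment is the qualitative signature usually associated with model-based accounts of detour behavior.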
