The detour problem in a stochastic environment: Tolman revisited

Pegah Fakhari; Arash Khodadadi; Jerome Busemeyer

arXiv:1709.09761·stat.ML·September 29, 2017

TL;DR

This study investigates human planning and re-planning in a stochastic grid world, showing that people can learn environment dynamics and adapt their plans when paths are blocked, with model-based reinforcement learning best explaining behavior.

Contribution

It introduces a novel grid world task to study human re-planning in stochastic environments and compares multiple models, highlighting the effectiveness of model-based reinforcement learning.

Findings

1. Most participants learned to plan optimally in the environment.

2. Participants revised their plans when the optimal path was unexpectedly blocked.

3. Model-based reinforcement learning best explained re-planning behavior.

Abstract

We designed a grid world task to study human planning and re-planning behavior in an unknown stochastic environment. In our grid world, participants were asked to travel from a random starting point to a random goal position while maximizing their reward. Because they were not familiar with the environment, they needed to learn its characteristics from experience to plan optimally. Later in the task, we randomly blocked the optimal path to investigate whether and how people adjust their original plans to find a detour. To this end, we developed and compared 12 different models. These models differed in how they learned and represented the environment and in how they planned to reach the goal. The majority of our participants were able to plan optimally. We also showed that people were capable of revising their plans when an unexpected event occurred. The result from the model…
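To make the planning and detour idea concrete, here is a minimal sketch of model-based planning by value iteration in a small stochastic grid world, followed by re-planning after a cell on the preferred path is blocked. It is not one of the 12 models from the paper, whose details are not given here. The grid size, the "move succeeds with probability `p_success`, otherwise stay in place" transition rule, the step cost, the discount factor, and all function names are assumptions chosen for illustration. The transition model is also assumed to be known, whereas participants had to learn it from experience.

```python
from typing import Dict, List, Set, Tuple

State = Tuple[int, int]
# Hypothetical action set: (row delta, column delta).
ACTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1),
}


def intended_target(state: State, action: str, rows: int, cols: int,
                    blocked: Set[State]) -> State:
    """Cell reached if the move succeeds; walls and blocked cells keep the agent in place."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
    return nxt if inside and nxt not in blocked else state


def plan(rows: int, cols: int, goal: State, blocked: Set[State],
         p_success: float = 0.8, step_cost: float = -1.0,
         gamma: float = 0.95, tol: float = 1e-6):
    """Value iteration over an assumed transition model; returns (values, greedy policy)."""
    states = [(r, c) for r in range(rows) for c in range(cols)
              if (r, c) not in blocked]
    values = {s: 0.0 for s in states}  # the goal is terminal and keeps value 0

    def q(s: State, a: str) -> float:
        target = intended_target(s, a, rows, cols, blocked)
        # Assumed dynamics: move succeeds with p_success, otherwise the agent stays put.
        expected_next = p_success * values[target] + (1 - p_success) * values[s]
        return step_cost + gamma * expected_next

    while True:
        delta = 0.0
        for s in states:
            if s == goal:
                continue
            best = max(q(s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    policy = {s: max(ACTIONS, key=lambda a: q(s, a)) for s in states if s != goal}
    return values, policy


def greedy_path(start: State, goal: State, policy: Dict[State, str],
                rows: int, cols: int, blocked: Set[State],
                max_steps: int = 50) -> List[State]:
    """Follow the greedy policy, assuming every move succeeds (ignores slips)."""
    path, s = [start], start
    while s != goal and len(path) <= max_steps:
        s = intended_target(s, policy[s], rows, cols, blocked)
        path.append(s)
    return path


ROWS, COLS = 3, 3
START, GOAL = (0, 0), (0, 2)

# Plan in the open grid, then block a cell on that path and plan again.
_, policy_open = plan(ROWS, COLS, GOAL, blocked=set())
original = greedy_path(START, GOAL, policy_open, ROWS, COLS, set())

blocked_cells = {(0, 1)}
_, policy_blocked = plan(ROWS, COLS, GOAL, blocked=blocked_cells)
detour = greedy_path(START, GOAL, policy_blocked, ROWS, COLS, blocked_cells)

print("original path:", original)
print("detour path:  ", detour)
```

Because the planner recomputes values from its model of the environment, blocking a cell only requires updating the model and planning again. This is the property that distinguishes model-based approaches from purely habit-like (model-free) value caching in this sketch.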
