Experience-driven discovery of planning strategies

Ruiqi He; Falk Lieder

arXiv:2412.03111·cs.AI·December 5, 2024

Experience-driven discovery of planning strategies

Ruiqi He, Falk Lieder

PDF

Open Access

TL;DR

This paper explores how humans discover new planning strategies through metacognitive reinforcement learning, introducing models that better explain human strategy discovery but still lag in discovery speed.

Contribution

It proposes that metacognitive reinforcement learning underpins the discovery of new planning strategies and provides models that outperform alternative mechanisms in explaining this process.

Findings

01

Models demonstrate capability for strategy discovery.

02

Models better explain human strategy discovery.

03

Humans discover strategies faster than models.

Abstract

One explanation for how people can plan efficiently despite limited cognitive resources is that we possess a set of adaptive planning strategies and know when and how to use them. But how are these strategies acquired? While previous research has studied how individuals learn to choose among existing strategies, little is known about the process of forming new planning strategies. In this work, we propose that new planning strategies are discovered through metacognitive reinforcement learning. To test this, we designed a novel experiment to investigate the discovery of new planning strategies. We then present metacognitive reinforcement learning models and demonstrate their capability for strategy discovery as well as show that they provide a better explanation of human strategy discovery than alternative learning mechanisms. However, when fitted to human data, these models exhibit a…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsComplex Systems and Decision Making

MethodsSparse Evolutionary Training