TL;DR
This study finds that frontier Large Reasoning Models (LRMs) match human learning behavior during complex gameplay more closely than reinforcement learning agents do, and that they predict concurrent brain activity far more accurately.
Contribution
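The paper jointly evaluates LRMs on their ability to play novel games, match human learning behavior, and predict fMRI-recorded brain activity, comparing them against model-free and model-based deep reinforcement learning agents and a Bayesian theory-based agent. The results position LRMs as candidate models of human learning in naturalistic tasks.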
Findings
LRMs best match human behavioral patterns during game discovery.
LRMs predict brain activity an order of magnitude better than reinforcement learning agents.
Brain alignment with LRMs reflects in-context game state representation.
Abstract
Humans rapidly learn abstract knowledge when encountering novel environments and flexibly deploy this knowledge to guide efficient and intelligent action. Can modern AI systems learn and plan in a similar way? We study this question using a dataset of complex human gameplay with concurrent fMRI recordings, in which participants learn novel video games that require rule discovery, hypothesis revision, and multi-step planning. We jointly evaluate models by their ability to play the games, match human learning behavior, and predict brain activity during the same task, comparing a suite of frontier Large Reasoning Models (LRMs) against model-free and model-based deep reinforcement learning agents and a Bayesian theory-based agent. We find that frontier LRMs most closely match human behavioral patterns during game discovery and predict brain activity an order of magnitude better than both…
