World Models Unlock Optimal Foraging Strategies in Reinforcement Learning Agents
Yesid Fonseca, Manuel S. Ríos, Nicanor Quijano, Luis F. Giraldo

TL;DR
This paper demonstrates that reinforcement learning agents with learned world models naturally develop optimal foraging strategies aligned with the Marginal Value Theorem, highlighting the importance of predictive representations for adaptive decision-making.
Contribution
It shows that model-based reinforcement learning agents can inherently produce biologically plausible foraging behaviors aligned with ecological optimality principles.
Findings
Model-based RL agents converge to strategies aligned with the Marginal Value Theorem (MVT).
Predictive capabilities drive efficient patch-leaving decisions.
Model-based agents exhibit more biologically realistic behaviors than model-free agents.
Abstract
Patch foraging involves the deliberate and planned process of determining the optimal time to depart from a resource-rich region and investigate potentially more beneficial alternatives. The Marginal Value Theorem (MVT) is frequently used to characterize this process, offering an optimality model for such foraging behaviors. Although this model has been widely used to make predictions in behavioral ecology, discovering the computational mechanisms that facilitate the emergence of optimal patch-foraging decisions in biological foragers remains under investigation. Here, we show that artificial foragers equipped with learned world models naturally converge to MVT-aligned strategies. Using a model-based reinforcement learning agent that acquires a parsimonious predictive representation of its environment, we demonstrate that anticipatory capabilities, rather than reward maximization alone,…
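For readers unfamiliar with the MVT, the leaving rule it prescribes can be illustrated numerically. The sketch below is not the paper's agent or environment: the saturating gain function, the parameter names `A` and `tau`, and the helper `mvt_leave_time` are all assumptions chosen for illustration. It finds the residence time at which the marginal gain in the current patch drops to the long-run average rate (gain divided by residence time plus travel time), which is the standard MVT condition.

```python
import math

# Hypothetical patch model (an assumption, not taken from the paper):
# cumulative gain saturates as the patch depletes.
def gain(t, A=10.0, tau=5.0):
    return A * (1.0 - math.exp(-t / tau))

def marginal_gain(t, A=10.0, tau=5.0):
    # Derivative of gain with respect to residence time t.
    return (A / tau) * math.exp(-t / tau)

def average_rate(t, travel_time, A=10.0, tau=5.0):
    # Long-run intake rate when staying t in each patch and
    # spending travel_time moving between patches.
    return gain(t, A, tau) / (t + travel_time)

def mvt_leave_time(travel_time, A=10.0, tau=5.0, tol=1e-9):
    """Residence time where marginal gain equals the average rate.

    Assumes travel_time > 0. Solved by bisection on
    f(t) = marginal_gain(t) - average_rate(t), which is positive
    for small t and negative for large t.
    """
    def f(t):
        return marginal_gain(t, A, tau) - average_rate(t, travel_time, A, tau)

    lo, hi = 0.0, tau
    while f(hi) > 0.0:  # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for travel in (2.0, 10.0):
        t_star = mvt_leave_time(travel)
        print(f"travel={travel:5.1f}  leave after t*={t_star:.3f}")
```

Under these assumptions, a longer travel time yields a longer optimal residence time, which is the qualitative MVT prediction that foraging agents are typically compared against.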
Taxonomy
Topics: Embodied and Extended Cognition · Diffusion and Search Dynamics · Language and cultural evolution
