Mental Modeling of Reinforcement Learning Agents by Language Models
Wenhao Lu, Xufeng Zhao, Josua Spisak, Jae Hee Lee, Stefan Wermter

TL;DR
This paper investigates the ability of large language models to build mental models of reinforcement learning agents by reasoning about their behavior, revealing current limitations and potential for explainability in AI systems.
Contribution
The paper introduces the concept of agent mental modeling with LLMs, proposes evaluation metrics for it, and empirically assesses LLM capabilities across different RL tasks.
Findings
LLMs struggle to fully model agents through inference alone
Evaluation metrics for agent mental modeling are proposed and tested
Results highlight current limitations of LLMs in understanding RL agent behavior
Abstract
Can emergent language models faithfully model the intelligence of decision-making agents? Although modern language models already exhibit some reasoning ability and can, in theory, express any probability distribution over tokens, it remains underexplored how the world knowledge these pretrained models have memorized can be used to comprehend an agent's behaviour in the physical world. This study empirically examines, for the first time, how well large language models (LLMs) can build a mental model of agents, termed agent mental modelling, by reasoning about an agent's behaviour and its effect on states from agent interaction history. This research may unveil the potential of leveraging LLMs for elucidating RL agent behaviour, addressing a key challenge in eXplainable reinforcement learning (XRL). To this end, we propose specific evaluation metrics and test them on…
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Fuzzy Logic and Control Systems
