Modeling the Mental World for Embodied AI: A Comprehensive Review
Biyuan Liu, Daigang Xu, Lei Jiang, Wenjun Guo, Ping Chen

TL;DR
This comprehensive review of Mental World Models (MWM) in Embodied AI synthesizes over 100 studies. It establishes a theoretical framework and surveys the key components, reasoning paradigms, and evaluation benchmarks of MWM, with the aim of advancing social intelligence in embodied agents.
Contribution
The review constructs the first complete theoretical framework for MWM and clarifies its components, reasoning paradigms, and evaluation benchmarks, addressing key research bottlenecks.
Findings
Established a clear distinction between MWM and PWM.
Defined key components and paradigms for MWM.
Analyzed 19 ToM reasoning methods and 26 evaluation benchmarks.
Abstract
As the application of Embodied AI agents in avatars, wearable devices, and robotic systems continues to deepen, their core research challenges have gradually shifted from physical environment interaction to the accurate understanding of social interactions. Traditional physical world models (PWM) focus on quantifiable physical attributes such as space and motion, and so fail to meet the needs of social intelligence modeling. In contrast, the Mental World Model (MWM), as a structured representation of humans' internal mental states, has become the critical cognitive foundation for embodied agents to achieve natural human-machine collaboration and dynamic social adaptation. However, current MWM research faces significant bottlenecks, such as a fragmented conceptual framework with vague boundaries between MWM and PWM, disjointed reasoning mechanisms for the technical pathways and applicable…
Taxonomy
Topics: Social Robot Interaction and HRI · Action Observation and Synchronization · Embodied and Extended Cognition
