Evaluating Generative Models as Interactive Emergent Representations of Human-Like Collaborative Behavior

Shinas Shaji; Teena Chakkalayil Hassan; Sebastian Houben; Alex Mitrevski

arXiv:2605.03855 · cs.RO · May 7, 2026

TL;DR

This study investigates whether embodied foundation model agents demonstrate emergent collaborative behaviors indicative of mental models, using a 2D game environment, automated behavior detection, and human user evaluations.

Contribution

The paper contributes an experimental framework, empirical evidence of emergent collaborative behaviors in embodied LLM agents, and a validated behavioral analysis methodology.

Findings

01. Foundation models exhibit collaborative behaviors without explicit training.

02. Behavioral patterns vary across different LLMs during collaboration.

03. Human users report positive experiences and perceived effectiveness.

Abstract

Human-AI collaboration requires AI agents to understand human behavior for effective coordination. While advances in foundation models show promising capabilities in understanding and exhibiting human-like behavior, their application in embodied collaborative settings needs further investigation. This work examines whether embodied foundation model agents exhibit emergent collaborative behaviors that indicate underlying mental models of their collaborators, an important aspect of effective coordination. We develop a 2D collaborative game environment in which large language model agents and humans complete color-matching tasks that require coordination. We define five collaborative behaviors as indicators of emergent mental model representation: perspective-taking, collaborator-aware planning, introspection, theory of mind, and clarification. An automated behavior detection system…
