Expedient Assistance and Consequential Misunderstanding: Envisioning an Operationalized Mutual Theory of Mind
Justin D. Weisz, Michael Muller, Arielle Goldberg, Dario Andres Silva Moran

TL;DR
This paper uses design fictions to explore the potential benefits and risks of operationalizing a mutual theory of mind between humans and AI, highlighting both positive collaboration and possible misunderstandings.
Contribution
It introduces a novel approach using design fictions to examine the implications of implementing a mutual theory of mind in human-AI interactions.
Findings
A mutual theory of mind (MToM) can improve human-AI collaboration when each party's model of the other is well aligned.
Misaligned MToM models can cause misunderstandings and breakdowns.
Design fictions reveal both optimistic and cautionary scenarios for MToM.
Abstract
Design fictions allow us to prototype the future. They enable us to interrogate emerging or non-existent technologies and examine their implications. We present three design fictions that probe the potential consequences of operationalizing a mutual theory of mind (MToM) between human users and one (or more) AI agents. We use these fictions to explore many aspects of MToM, including how models of the other party are shaped through interaction, how discrepancies between these models lead to breakdowns, and how models of a human's knowledge and skills enable AI agents to act in their stead. We examine these aspects through two lenses: a utopian lens in which MToM enhances human-human interactions and leads to synergistic human-AI collaborations, and a dystopian lens in which a faulty or misaligned MToM leads to problematic outcomes. Our work provides an aspirational vision for…
Taxonomy
Topics: Psychotherapy Techniques and Applications
