AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind
Wei Ding, Fanhong Li, Ziteng Ji, Zhengrong Xue, Jia Liu

TL;DR
AToM-Bot is a proactive robot framework that infers human needs through affective theory of mind, generating and executing tasks to improve well-being without explicit commands, demonstrating high human satisfaction in diverse scenarios.
Contribution
This work introduces AToM-Bot, a novel framework combining affective theory of mind with vision-language models for autonomous need detection and task fulfillment in human-robot interaction.
Findings
High human satisfaction scores in need detection and task execution.
Effective inference of unspoken human needs in daily scenarios.
Successful generation of feasible plans to fulfill human needs.
Abstract
We propose AToM-Bot, a novel task generation and execution framework for proactive robot-human interaction, which leverages the human mental and physical state inference capabilities of a Vision Language Model (VLM) prompted by the Affective Theory of Mind (AToM). Without requiring explicit commands from humans, AToM-Bot proactively generates and carries out feasible tasks to improve general human well-being. When around humans, AToM-Bot first detects current human needs based on inferred human states and observations of the surrounding environment. It then generates tasks to fulfill these needs, taking into account its embodied constraints. We designed 16 daily life scenarios spanning 4 common scenes and presented the same visual stimuli to 59 human subjects and our robot. We used the similarity between human open-ended answers and robot output, and the human satisfaction scores to metric…
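The two stages described in the abstract (need detection, then constraint-aware task generation) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Observation`, `toy_infer_need`, `generate_task`, `proactive_step`) is hypothetical, and the VLM prompted with AToM is replaced by a small lookup table purely so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Observation:
    """Hypothetical container for what the robot perceives."""
    human_state: str           # inferred mental/physical state, as text
    scene_objects: List[str]   # objects visible in the environment


def toy_infer_need(obs: Observation) -> Optional[str]:
    """Stand-in for the VLM call (assumption): a lookup table, not a model."""
    rules = {"coughing": "water", "shivering": "blanket"}
    return rules.get(obs.human_state)


def generate_task(need: Optional[str], obs: Observation,
                  graspable: Set[str]) -> Optional[List[str]]:
    """Return a plan only if the need can be met within embodied constraints."""
    if need is None:
        return None  # no need detected, so stay idle
    if need not in obs.scene_objects or need not in graspable:
        return None  # object absent or outside the robot's abilities
    return [f"locate {need}", f"grasp {need}", f"deliver {need} to human"]


def proactive_step(obs: Observation,
                   infer: Callable[[Observation], Optional[str]],
                   graspable: Set[str]) -> Optional[List[str]]:
    """One cycle: detect a need, then plan a feasible task for it."""
    return generate_task(infer(obs), obs, graspable)


if __name__ == "__main__":
    obs = Observation("coughing", ["water", "book"])
    print(proactive_step(obs, toy_infer_need, {"water"}))
```

Passing the inference function as an argument keeps the planning step independent of how the need was inferred, so a real model call could be swapped in for the toy table without touching the constraint check.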
Taxonomy
Topics: Psychiatry, Mental Health, Neuroscience
