MAction-SocialNav: Multi-Action Socially Compliant Navigation via Reasoning-enhanced Prompt Tuning
Zishuo Wang, Xinyu Zhang, Zhuonan Liu, Tomohito Kawabata, Daeun Song, Xuesu Xiao, Ling Xiao

TL;DR
This paper introduces MAction-SocialNav, a vision-language model that generates multiple socially compliant navigation actions, improving reasoning and safety in ambiguous social scenarios for robots.
Contribution
The work presents a novel multi-action socially compliant navigation model with reasoning-enhanced prompt tuning and a new dataset for diverse social navigation scenarios.
Findings
Outperforms GPT-4o and Claude in decision quality and safety alignment.
Achieves real-time performance, running more than 3x faster.
Demonstrates strong social reasoning and efficiency in complex environments.
Abstract
Socially compliant navigation requires robots to move safely and appropriately in human-centered environments by respecting social norms. However, social norms are often ambiguous, and in a single scenario, multiple actions may be equally acceptable. Most existing methods simplify this problem by assuming a single correct action, which limits their ability to handle real-world social uncertainty. In this work, we propose MAction-SocialNav, an efficient vision-language model for socially compliant navigation that explicitly addresses action ambiguity by generating multiple plausible actions within one scenario. To enhance the model's reasoning capability, we introduce a novel meta-cognitive prompt (MCP) method. Furthermore, to evaluate the proposed method, we curate a multi-action socially compliant navigation dataset that accounts for diverse conditions, including crowd density,…
Taxonomy
Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Robotics and Sensor-Based Localization
