Safe and Interpretable Multimodal Path Planning for Multi-Agent Cooperation
Haojun Shi, Suyu Ye, Katherine M. Guerrerio, Jianzhi Shen, Yifan Yin, Daniel Khashabi, Chien-Ming Huang, Tianmin Shu

TL;DR
This paper introduces CaPE, a multimodal path planning method that uses vision-language models and model-based verification to enable safe, interpretable, and cooperative multi-agent navigation and collaboration in diverse scenarios.
Contribution
It presents a novel approach combining vision-language models with model-based planning for safe, interpretable, and adaptable multi-agent path planning using language communication.
Findings
CaPE effectively integrates with robotic systems as a plug-and-play module.
It improves the safety and interpretability of multi-agent cooperation.
Experimental results show improved alignment of plans with language communication.
Abstract
Successful cooperation among decentralized agents requires each agent to quickly adapt its plan to the behavior of other agents. In scenarios where agents cannot confidently predict one another's intentions and plans, language communication can be crucial for ensuring safety. In this work, we focus on path-level cooperation in which agents must adapt their paths to one another in order to avoid collisions or perform physical collaboration such as joint carrying. In particular, we propose a safe and interpretable multimodal path planning method, CaPE (Code as Path Editor), which generates and updates path plans for an agent based on the environment and language communication from other agents. CaPE leverages a vision-language model (VLM) to synthesize a path editing program verified by a model-based planner, grounding communication to path plan updates in a safe and interpretable way. We…
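The abstract describes a loop in which a proposed path-editing program is checked by a model-based planner before it changes the agent's plan. The sketch below illustrates that general idea on a toy grid and is not the authors' implementation. The VLM is replaced by two hand-written edits, and every name in it (`wait_at`, `shift_segment`, `is_valid`, `apply_verified`) is a hypothetical helper introduced here. The validity rules (in bounds, obstacle-free, at most one cell per step, fixed start and goal) are likewise assumptions.

```python
from typing import Callable, List, Set, Tuple

Cell = Tuple[int, int]
Path = List[Cell]
Edit = Callable[[Path], Path]


def wait_at(index: int, steps: int) -> Edit:
    """Hypothetical edit: pause at waypoint `index` for `steps` extra time steps."""
    def edit(path: Path) -> Path:
        return path[:index + 1] + [path[index]] * steps + path[index + 1:]
    return edit


def shift_segment(start: int, end: int, dx: int, dy: int) -> Edit:
    """Hypothetical edit: translate waypoints start..end (inclusive) by (dx, dy)."""
    def edit(path: Path) -> Path:
        moved = [(x + dx, y + dy) for x, y in path[start:end + 1]]
        return path[:start] + moved + path[end + 1:]
    return edit


def is_valid(path: Path, obstacles: Set[Cell], width: int, height: int) -> bool:
    """Assumed model-based check: in bounds, obstacle-free, one cell (or a wait) per step."""
    if not path:
        return False
    for x, y in path:
        if not (0 <= x < width and 0 <= y < height):
            return False
        if (x, y) in obstacles:
            return False
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        if abs(x1 - x0) + abs(y1 - y0) > 1:
            return False
    return True


def apply_verified(path: Path, edits: List[Edit], obstacles: Set[Cell],
                   width: int, height: int) -> Tuple[Path, List[bool]]:
    """Apply edits in order; keep an edit only if the edited path passes verification."""
    current = list(path)
    accepted = []
    for edit in edits:
        candidate = edit(current)
        ok = (
            is_valid(candidate, obstacles, width, height)
            and candidate[0] == path[0]      # start must be preserved
            and candidate[-1] == path[-1]    # goal must be preserved
        )
        if ok:
            current = candidate
        accepted.append(ok)
    return current, accepted


# A 4x2 grid with one obstacle; the "program" stands in for VLM output.
path = [(0, 0), (1, 0), (2, 0), (3, 0)]
obstacles = {(1, 1)}
program = [wait_at(1, 2), shift_segment(1, 2, 0, 1)]
final_path, accepted = apply_verified(path, program, obstacles, width=4, height=2)
print(final_path)
print(accepted)
```

In this toy run the waiting edit is accepted, while the shift is rejected because it would move the agent into the obstacle cell. The point of the design is that a rejected edit leaves the previous verified path in place, and the accepted edits remain readable as a record of why the path changed.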
Taxonomy
Topics: Robotic Path Planning Algorithms · Multimodal Machine Learning Applications · Social Robot Interaction and HRI
