TP-MDDN: Task-Preferenced Multi-Demand-Driven Navigation with Autonomous Decision-Making
Shanshan Li, Da Huang, Yu He, Yanwei Fu, Yu-Gang Jiang, Xiangyang Xue

TL;DR
This paper introduces TP-MDDN, a benchmark for multi-demand navigation with explicit task preferences in embodied AI, and proposes AWMSystem, an autonomous decision-making system backed by a spatial memory called MASMap, which the authors report outperforms existing methods.
Contribution
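We present TP-MDDN, a new benchmark for multi-demand navigation, and develop AWMSystem, which combines three modules (BreakLLM for instruction decomposition, LocateLLM for goal selection, and StatusMLLM for task monitoring) with MASMap, a spatial memory for autonomous decision-making in complex environments.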
Findings
AWMSystem outperforms state-of-the-art baselines in perception accuracy
It achieves higher navigation robustness in complex tasks
It handles multiple sub-demands and task preferences effectively
Abstract
In daily life, people often move through spaces to find objects that meet their needs, posing a key challenge in embodied AI. Traditional Demand-Driven Navigation (DDN) handles one need at a time but does not reflect the complexity of real-world tasks involving multiple needs and personal choices. To bridge this gap, we introduce Task-Preferenced Multi-Demand-Driven Navigation (TP-MDDN), a new benchmark for long-horizon navigation involving multiple sub-demands with explicit task preferences. To solve TP-MDDN, we propose AWMSystem, an autonomous decision-making system composed of three key modules: BreakLLM (instruction decomposition), LocateLLM (goal selection), and StatusMLLM (task monitoring). For spatial memory, we design MASMap, which combines 3D point cloud accumulation with 2D semantic mapping for accurate and efficient environmental understanding. Our Dual-Tempo action…
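The abstract says MASMap combines 3D point cloud accumulation with 2D semantic mapping but does not say how the two are fused. The sketch below is one plausible reading, not the authors' implementation: labeled 3D points are accumulated, then projected onto a 2D grid where each cell takes the majority label. The class name `SemanticGridMap`, the `cell_size` parameter, and the convention that `z` is height (dropped in the projection) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class SemanticGridMap:
    """Hypothetical 2D semantic map built from accumulated labeled 3D points."""

    def __init__(self, cell_size=0.25):
        self.cell_size = cell_size          # grid resolution in metres (assumed)
        self.points = []                    # accumulated (x, y, z, label) tuples
        self.votes = defaultdict(Counter)   # (col, row) -> label vote counts

    def _cell(self, x, y):
        # Project a ground-plane position onto integer grid coordinates.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def add_points(self, labeled_points):
        # Keep the raw 3D points and update the 2D votes in one pass.
        for x, y, z, label in labeled_points:
            self.points.append((x, y, z, label))
            self.votes[self._cell(x, y)][label] += 1

    def label_at(self, x, y):
        # Majority label for the cell containing (x, y), or None if unseen.
        counts = self.votes.get(self._cell(x, y))
        if not counts:
            return None
        return counts.most_common(1)[0][0]

    def cells_with(self, label):
        # All cells whose majority label matches, sorted for determinism.
        return sorted(
            cell
            for cell, counts in self.votes.items()
            if counts.most_common(1)[0][0] == label
        )


if __name__ == "__main__":
    grid = SemanticGridMap(cell_size=0.5)
    grid.add_points([
        (0.1, 0.1, 0.4, "cup"),
        (0.2, 0.3, 0.5, "cup"),
        (0.4, 0.4, 0.0, "table"),
        (1.2, 0.1, 0.3, "sofa"),
    ])
    print(grid.label_at(0.0, 0.0))
    print(grid.cells_with("sofa"))
```

In a setup like this, a goal-selection step could call `cells_with` to list candidate locations for an object category; how LocateLLM actually queries MASMap is not described in the abstract.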
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Spatial Cognition and Navigation
