MMP-A*: Multimodal Perception Enhanced Incremental Heuristic Search on Path Planning

Minh Hieu Ha; Khanh Ly Ta; Hung Phan; Tung Doan; Tung Dao; Dao Tran; Huynh Thi Thanh Binh

arXiv:2601.01910 · cs.AI · January 23, 2026

Open Access

TL;DR

MMP-A* is a multimodal path planning framework that combines vision-language models with an adaptive heuristic to produce efficient, accurate, and geometry-aware navigation in complex environments.

Contribution

The paper introduces a novel multimodal framework with an adaptive decay mechanism that improves path planning by grounding high-level reasoning in physical geometry.

Findings

01. Achieves near-optimal trajectories in cluttered environments

02. Reduces computational and memory costs significantly

03. Outperforms text-only guidance methods in complex scenarios

Abstract

Autonomous path planning requires a synergy between global reasoning and geometric precision, especially in complex or cluttered environments. While classical A* is valued for its optimality, it incurs prohibitive computational and memory costs in large-scale scenarios. Recent attempts to mitigate these limitations by using Large Language Models for waypoint guidance remain insufficient, as they rely only on text-based reasoning without spatial grounding. As a result, such models often produce incorrect waypoints in topologically complex environments with dead ends, and lack the perceptual capacity to interpret ambiguous physical boundaries. These inconsistencies lead to costly corrective expansions and undermine the intended computational efficiency. We introduce MMP-A*, a multimodal framework that integrates the spatial grounding capabilities of vision-language models with a novel…
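
For context on the baseline the abstract refers to, the following is a minimal sketch of classical A* on a 4-connected occupancy grid. It is not the MMP-A* method: the grid encoding, the Manhattan heuristic, and the names `astar` and `expansions` are illustrative assumptions, chosen only to show how a dead-end pocket forces extra node expansions before the optimal path is found.

```python
import heapq


def astar(grid, start, goal):
    """Classical A* on a 4-connected grid (illustrative baseline, not MMP-A*).

    grid[r][c] == 1 marks an obstacle. Returns (path, expansions), where
    path is a list of (row, col) cells, or None if the goal is unreachable.
    """
    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()
    expansions = 0

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        expansions += 1
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], expansions
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None, expansions


# A small map with a dead-end pocket directly between start and goal.
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
path, expansions = astar(grid, (2, 0), (2, 4))
print(len(path), expansions)
```

On this map the heuristic pulls the search into the pocket first, so the number of expanded cells exceeds the length of the returned path. That overhead is the kind of cost that waypoint guidance is meant to reduce, and a misplaced waypoint can reintroduce it.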

Taxonomy

Topics: Robotic Path Planning Algorithms · Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization