Can LLMs Prove Robotic Path Planning Optimality? A Benchmark for Research-Level Algorithm Verification
Zhengbang Yang, Md. Tasin Tazwar, Minghan Wei, Zhuangdi Zhu

TL;DR
This paper introduces a benchmark to evaluate large language models' ability to verify the optimality of robotic path planning algorithms, revealing current limitations and potential improvements through context augmentation.
Contribution
It presents the first benchmark for evaluating LLMs on research-level proofs of path planning optimality, covering diverse tasks and an analysis of reasoning challenges.
Findings
State-of-the-art LLMs struggle to produce valid proofs when they are not given domain knowledge.
Providing task-specific lemmas improves reasoning more than generic prompts.
Error analysis identifies logical failures and hallucinations, guiding mitigation strategies.
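The contrast between generic prompts and task-specific lemmas can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `ProofTask` fields, the `build_prompt` function, the three mode names, and the prompt wording are hypothetical and are not taken from the benchmark or the paper's prompts.

```python
from dataclasses import dataclass, field


@dataclass
class ProofTask:
    """Hypothetical container for one proof task (not the benchmark's schema)."""
    problem: str        # informal problem statement
    algorithm: str      # description of the approximation algorithm
    claimed_ratio: str  # claimed approximation ratio, e.g. "2"
    lemmas: list[str] = field(default_factory=list)  # task-specific lemmas


# Assumed wording for a generic, task-independent reasoning hint.
GENERIC_HINT = "Reason step by step and justify every inequality."


def build_prompt(task: ProofTask, mode: str = "plain") -> str:
    """Assemble a proof-request prompt.

    mode is "plain" (no extra context), "generic" (generic reasoning hint),
    or "lemmas" (task-specific lemmas supplied as context).
    """
    if mode not in {"plain", "generic", "lemmas"}:
        raise ValueError(f"unknown mode: {mode!r}")
    parts = [
        f"Problem: {task.problem}",
        f"Algorithm: {task.algorithm}",
        f"Claim: the algorithm is a {task.claimed_ratio}-approximation.",
    ]
    if mode == "generic":
        parts.append(GENERIC_HINT)
    elif mode == "lemmas":
        parts.append("You may use the following lemmas without proof:")
        parts.extend(
            f"Lemma {i}: {lemma}" for i, lemma in enumerate(task.lemmas, 1)
        )
    parts.append("Prove the claim.")
    return "\n".join(parts)
```

Calling `build_prompt(task, "lemmas")` and `build_prompt(task, "generic")` on the same task yields two prompts that differ only in the added context, which is the kind of controlled comparison the findings describe.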
Abstract
Robotic path planning problems are often NP-hard, and practical solutions typically rely on approximation algorithms with provable performance guarantees for general cases. While designing such algorithms is challenging, formally proving their approximation optimality is even more demanding, since it requires domain-specific geometric insights and multi-step mathematical reasoning over complex operational constraints. Recent Large Language Models (LLMs) have demonstrated strong performance on mathematical reasoning benchmarks, yet their ability to assist with research-level optimality proofs in robotic path planning remains under-explored. In this work, we introduce the first benchmark for evaluating LLMs on approximation-ratio proofs of robotic path planning algorithms. The benchmark consists of 34 research-grade proof tasks spanning diverse planning problem types and complexity levels,…
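For readers unfamiliar with the term, the standard textbook notion of an approximation ratio for a minimization problem can be written as follows. The symbols ALG, OPT, LB, and α are notation assumed here for illustration; the listing does not state the paper's own notation.

```latex
% Assumed notation: I is a problem instance, ALG(I) is the cost of the
% algorithm's solution, OPT(I) is the optimal cost, and \alpha \ge 1.
\[
  \mathrm{ALG}(I) \;\le\; \alpha \cdot \mathrm{OPT}(I)
  \qquad \text{for every instance } I .
\]
% A common proof pattern: find a lower bound LB(I) on the optimum and
% bound the algorithm's cost against it.
\[
  \mathrm{LB}(I) \le \mathrm{OPT}(I)
  \quad\text{and}\quad
  \mathrm{ALG}(I) \le \alpha \cdot \mathrm{LB}(I)
  \;\;\Longrightarrow\;\;
  \mathrm{ALG}(I) \le \alpha \cdot \mathrm{OPT}(I).
\]
```

An approximation-ratio proof therefore usually hinges on choosing the right lower bound, which in path planning is often a geometric quantity.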
Taxonomy
Topics: AI-based Problem Solving and Planning · Robotic Path Planning Algorithms · Multimodal Machine Learning Applications
