Interleaved LLM and Motion Planning for Generalized Multi-Object Collection in Large Scene Graphs
Ruochu Yang, Yu Zhou, Fumin Zhang, Mengxue Hou

TL;DR
This paper introduces Inter-LLM, an algorithm that interleaves an LLM with motion planning for multi-object collection tasks in large scene graphs, reporting a 30% improvement in mission success rate in complex, open-set environments.
Contribution
The paper presents a new interleaved LLM and motion planning approach with a multimodal action cost function for better long-horizon planning in large, uncertain environments.
Findings
30% improvement in mission success rate
Enhanced efficiency in multi-object collection tasks
Better handling of open-set objects and large environments
Abstract
Household robots have been a longstanding research topic, but they still lack human-like intelligence, particularly in manipulating open-set objects and navigating large environments efficiently and accurately. To push this boundary, we consider a generalized multi-object collection problem in large scene graphs, where the robot needs to pick up and place multiple objects across multiple locations in a long mission of multiple human commands. This problem is extremely challenging since it requires long-horizon planning in a vast action-state space under high uncertainty. To this end, we propose a novel interleaved LLM and motion planning algorithm, Inter-LLM. By designing a multimodal action cost similarity function, our algorithm can both reflect the history and look into the future to optimize plans, striking a good balance of quality and efficiency. Simulation experiments…
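To make the idea of interleaving concrete, here is a minimal sketch of a loop that re-scores candidate objects after every pick by combining a semantic score with a motion cost. It is not the paper's algorithm: the abstract does not specify the cost function, so the weighted sum, the weights `w_sem` and `w_motion`, the function names, and the keyword-matching `stub_llm_score` (a stand-in for a real LLM call) are all assumptions made for illustration. The sketch is also greedy, so it does not reproduce the look-ahead described above, and the stub ignores the history it is given.

```python
import heapq


def shortest_path_cost(graph, start, goal):
    """Dijkstra over a weighted scene graph given as {node: {neighbor: cost}}."""
    dist = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return float("inf")


def stub_llm_score(command, obj, history):
    """Hypothetical stand-in for an LLM relevance score in [0, 1].

    A real system would query a language model; this stub only checks
    whether the object name appears in the command and ignores history.
    """
    return 1.0 if obj in command else 0.1


def interleaved_plan(graph, start, objects, command, w_sem=1.0, w_motion=0.1):
    """Greedy loop: after each pick, re-score the remaining objects.

    The combined cost (an assumed form, not taken from the paper) is
    w_sem * (1 - semantic score) + w_motion * shortest-path travel cost.
    """
    position, history = start, []
    remaining = dict(objects)  # object name -> node where it is located

    while remaining:
        def cost(obj):
            semantic = 1.0 - stub_llm_score(command, obj, history)
            travel = shortest_path_cost(graph, position, remaining[obj])
            return w_sem * semantic + w_motion * travel

        best = min(remaining, key=cost)
        history.append(best)
        position = remaining.pop(best)  # robot is now at that object's node
    return history
```

With a four-room graph and the command "bring the cup and the book", the loop visits the two named objects first and leaves an unmentioned one for last, because the semantic term dominates the travel term under the default weights.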
