Interleaved LLM and Motion Planning for Generalized Multi-Object Collection in Large Scene Graphs

Ruochu Yang; Yu Zhou; Fumin Zhang; Mengxue Hou

arXiv:2507.15782 · cs.RO · July 22, 2025

TL;DR

This paper introduces Inter-LLM, an algorithm that interleaves LLM and motion planning for multi-object collection in large scene graphs, with a reported 30% improvement in mission success rate in complex, open-set environments.

Contribution

The paper presents a new interleaved LLM and motion planning approach with a multimodal action cost function for better long-horizon planning in large, uncertain environments.

Findings

01. 30% improvement in mission success rate

02. Enhanced efficiency in multi-object collection tasks

03. Better handling of open-set objects and large environments

Abstract

Household robots have been a longstanding research topic, but they still lack human-like intelligence, particularly in manipulating open-set objects and navigating large environments efficiently and accurately. To push this boundary, we consider a generalized multi-object collection problem in large scene graphs, where the robot needs to pick up and place multiple objects across multiple locations in a long mission of multiple human commands. This problem is extremely challenging since it requires long-horizon planning in a vast action-state space under high uncertainties. To this end, we propose a novel interleaved LLM and motion planning algorithm Inter-LLM. By designing a multimodal action cost similarity function, our algorithm can both reflect the history and look into the future to optimize plans, striking a good balance of quality and efficiency. Simulation experiments…
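
The abstract does not define the multimodal action cost function, so the following is only a toy illustration of the general idea of mixing a language-derived score with a motion-planning cost, not the paper's method. Everything in it is an assumption made for illustration: the room graph, the `semantic_scores` dictionary (a stand-in for relevance scores an LLM might assign to each room), the weight `alpha`, the greedy selection rule, and all function names. It also assumes every candidate room is reachable from the start.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra over a weighted adjacency dict {node: {neighbor: weight}}."""
    dist = {source: 0.0}
    frontier = [(0.0, source)]
    while frontier:
        d, node = heapq.heappop(frontier)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(frontier, (candidate, neighbor))
    return dist


def action_cost(distance, max_distance, semantic_score, alpha=0.5):
    """Hypothetical cost: weighted sum of normalized travel and semantic mismatch."""
    travel = distance / max_distance if max_distance > 0 else 0.0
    return alpha * travel + (1.0 - alpha) * (1.0 - semantic_score)


def plan_visits(graph, start, semantic_scores, alpha=0.5):
    """Greedy loop: re-run path planning from the current room before each choice."""
    current = start
    remaining = set(semantic_scores)
    order = []
    while remaining:
        dist = shortest_distances(graph, current)
        max_distance = max(dist[room] for room in remaining)
        # Ties are broken by room name so the result is deterministic.
        best = min(
            remaining,
            key=lambda room: (
                action_cost(dist[room], max_distance, semantic_scores[room], alpha),
                room,
            ),
        )
        order.append(best)
        remaining.remove(best)
        current = best
    return order


# Hypothetical scene graph: rooms as nodes, edge weights as travel distances.
rooms = {
    "hall": {"kitchen": 2.0, "bedroom": 5.0},
    "kitchen": {"hall": 2.0, "pantry": 1.0},
    "bedroom": {"hall": 5.0},
    "pantry": {"kitchen": 1.0},
}
# Stand-in for LLM-assigned relevance of each room to a command such as "find the mug".
semantic_scores = {"kitchen": 0.9, "bedroom": 0.2, "pantry": 0.6}

visit_order = plan_visits(rooms, "hall", semantic_scores)
print(visit_order)
```

In this sketch, `alpha` trades off travel distance against semantic relevance: values near 1 favor nearby rooms, values near 0 favor rooms the language score ranks highly. How Inter-LLM actually balances these terms, and how it uses history and lookahead, is not stated in the text above.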
