Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models
Guanqun Cao, Ryan Mckenna, Erich Graf, John Oyekan

TL;DR
This paper introduces a flexible, language-conditioned object rearrangement framework using Large Language Models that generalizes to various objects and instructions without prior training, mimicking human reasoning.
Contribution
It presents a novel zero-shot approach leveraging LLMs for robotic object rearrangement, reducing reliance on pre-collected datasets and enhancing generalization capabilities.
Findings
Effective execution of rearrangement tasks with long instruction sequences
Generalizes to various objects and instructions in a zero-shot setting
Outperforms traditional dataset-dependent methods
Abstract
Object manipulation for rearrangement into a specific goal state is a significant task for collaborative robots. Accurately determining object placement is a key challenge, as misalignment can increase task complexity and the risk of collisions, affecting the efficiency of the rearrangement process. Most current methods heavily rely on pre-collected datasets to train the model for predicting the goal position. As a result, these methods are restricted to specific instructions, which limits their broader applicability and generalisation. In this paper, we propose a framework of flexible language-conditioned object rearrangement based on the Large Language Model (LLM). Our approach mimics human reasoning by making use of successful past experiences as a reference to infer the best strategies to achieve a current desired goal position. Based on LLM's strong natural language comprehension…
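The abstract's idea of using successful past experiences as a reference can be illustrated with a small prompt-building sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Experience` record, the word-overlap retrieval metric, the prompt layout, and all function names are hypothetical, and no LLM call is made.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Experience:
    """Hypothetical record of one past successful rearrangement."""
    instruction: str
    scene: dict[str, tuple[float, float]]  # object name -> (x, y) position
    goal: tuple[float, float]              # where the moved object ended up


def word_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (an assumed stand-in metric)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def select_examples(memory: list[Experience], instruction: str, k: int = 2) -> list[Experience]:
    """Pick the k past experiences whose instructions best match the new one."""
    ranked = sorted(memory, key=lambda e: word_overlap(e.instruction, instruction), reverse=True)
    return ranked[:k]


def build_prompt(memory: list[Experience], instruction: str,
                 scene: dict[str, tuple[float, float]], k: int = 2) -> str:
    """Assemble a few-shot prompt: retrieved experiences first, then the new task."""
    lines = ["Past successful rearrangements:"]
    for e in select_examples(memory, instruction, k):
        lines.append(f"Instruction: {e.instruction}")
        lines.append(f"Scene: {e.scene}")
        lines.append(f"Goal position: {e.goal}")
    lines.append("Current task:")
    lines.append(f"Instruction: {instruction}")
    lines.append(f"Scene: {scene}")
    lines.append("Goal position:")  # left open for the language model to complete
    return "\n".join(lines)


memory = [
    Experience("put the mug left of the plate", {"mug": (0.5, 0.2), "plate": (0.3, 0.3)}, (0.1, 0.3)),
    Experience("stack the red block on the blue block", {"red block": (0.2, 0.1), "blue block": (0.6, 0.4)}, (0.6, 0.4)),
    Experience("place the fork right of the knife", {"fork": (0.1, 0.1), "knife": (0.4, 0.2)}, (0.6, 0.2)),
]
prompt = build_prompt(memory, "put the apple left of the bowl",
                      {"apple": (0.7, 0.1), "bowl": (0.4, 0.4)})
```

The resulting `prompt` string would then be sent to whichever language model is in use; the model's completion after the final "Goal position:" line is what a downstream planner would parse as the predicted placement.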
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
