MORE: Mobile Manipulation Rearrangement Through Grounded Language Reasoning
Mohammad Mohammadi, Daniel Honerkamp, Martin Büchner, Matteo Cassinelli, Tim Welschehold, Fabien Despinoy, Igor Gilitschenski, Abhinav Valada

TL;DR
MORE enhances language model-based mobile manipulation planning by using scene graphs and active filtering, enabling reliable zero-shot rearrangement in diverse indoor and outdoor environments, and outperforming recent methods on the BEHAVIOR-1K benchmark.
Contribution
It introduces a novel scene graph-based approach with active filtering for improved zero-shot mobile manipulation planning.
Findings
Successfully solves a significant share of BEHAVIOR-1K benchmark tasks.
Outperforms recent foundation model-based approaches.
Demonstrates capabilities in complex real-world activities.
Abstract
Autonomous long-horizon mobile manipulation encompasses a multitude of challenges, including scene dynamics, unexplored areas, and error recovery. Recent works have leveraged foundation models for scene-level robotic reasoning and planning. However, the performance of these methods degrades when dealing with a large number of objects and large-scale environments. To address these limitations, we propose MORE, a novel approach for enhancing the capabilities of language models to solve zero-shot mobile manipulation planning for rearrangement tasks. MORE leverages scene graphs to represent environments, incorporates instance differentiation, and introduces an active filtering scheme that extracts task-relevant subgraphs of object and region instances. These steps yield a bounded planning problem, effectively mitigating hallucinations and improving reliability. Additionally, we introduce…
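The idea of instance differentiation followed by filtering a scene graph down to a task-relevant subgraph can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how relevance is decided, so a plain category lookup stands in for the filtering step here, and the function names, the data layout, and the toy scene are all hypothetical.

```python
from collections import defaultdict


def differentiate_instances(objects):
    """Assign a unique id per object instance (hypothetical naming: cup_1, cup_2, ...)."""
    counts = defaultdict(int)
    nodes = []
    for category, region in objects:
        counts[category] += 1
        nodes.append({
            "id": f"{category}_{counts[category]}",
            "category": category,
            "region": region,
        })
    return nodes


def filter_subgraph(nodes, relevant_categories):
    """Keep only objects whose category is task-relevant, plus the regions containing them.

    Assumption: relevance is a simple set-membership test. The paper's
    actual filtering scheme may work quite differently.
    """
    kept = [n for n in nodes if n["category"] in relevant_categories]
    regions = sorted({n["region"] for n in kept})
    return {"regions": regions, "objects": kept}


# Toy scene: (object category, region) pairs, invented for illustration.
scene = [
    ("cup", "kitchen"),
    ("cup", "living_room"),
    ("book", "bedroom"),
    ("sink", "kitchen"),
    ("lamp", "bedroom"),
]

nodes = differentiate_instances(scene)
# Hypothetical task: "put the cups in the sink".
subgraph = filter_subgraph(nodes, {"cup", "sink"})
```

In this toy case the planner would see three object instances in two regions instead of the full scene, which is the sense in which filtering bounds the planning problem.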
Taxonomy
Topics: Speech and dialogue systems · AI in Service Interactions · Natural Language Processing Techniques
