Whole-Body Mobile Manipulation using Offline Reinforcement Learning on Sub-optimal Controllers
Snehal Jauhri, Vignesh Prasad, Georgia Chalvatzaki

TL;DR
This paper introduces WHOLE-MoMa, a two-stage offline reinforcement learning pipeline that improves whole-body mobile manipulation by leveraging sub-optimal controllers for data collection and policy refinement, enabling effective real-world task execution.
Contribution
The paper proposes an offline RL approach that uses sub-optimal whole-body controllers to generate data, and extends implicit Q-learning with Q-chunking to handle complex coordination tasks.
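To make the two named ingredients concrete, here is a minimal sketch, not the authors' implementation. It shows the standard expectile loss from implicit Q-learning and one simple way to group consecutive actions into chunks. The function names `expectile_loss` and `chunk_actions`, the default `tau=0.7`, and the overlapping-window chunking are illustrative assumptions; the summary above does not specify how WHOLE-MoMa defines or uses chunks.

```python
import numpy as np


def expectile_loss(q_values, v_values, tau=0.7):
    """Asymmetric squared error used in implicit Q-learning.

    With tau > 0.5, errors where Q exceeds V are weighted more heavily,
    so V is pushed toward an upper expectile of Q over dataset actions.
    tau=0.7 is an illustrative default, not a value from the paper.
    """
    diff = np.asarray(q_values, dtype=float) - np.asarray(v_values, dtype=float)
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return float(np.mean(weight * diff ** 2))


def chunk_actions(actions, horizon):
    """Flatten each window of `horizon` consecutive actions into one vector.

    Hypothetical helper: input has shape (T, action_dim), output has shape
    (T - horizon + 1, horizon * action_dim). A critic over such chunks scores
    a short action sequence instead of a single step.
    """
    actions = np.asarray(actions, dtype=float)
    n_chunks = actions.shape[0] - horizon + 1
    return np.stack(
        [actions[i:i + horizon].reshape(-1) for i in range(n_chunks)]
    )


if __name__ == "__main__":
    # Toy trajectory: 4 steps of a 2-dimensional action.
    trajectory = np.arange(8).reshape(4, 2)
    print(chunk_actions(trajectory, horizon=2).shape)
    print(expectile_loss([1.0, 0.0], [0.0, 1.0], tau=0.7))
```

In this sketch, a chunked critic would take a state and one row of `chunk_actions` as input. How the paper combines chunking with the IQL value and policy updates is not stated in this summary.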
Findings
Outperforms baseline methods in simulation tasks.
Achieves high success rates in real robot experiments.
Policies transfer directly without finetuning.
Abstract
Mobile Manipulation (MoMa) of articulated objects, such as opening doors, drawers, and cupboards, demands simultaneous, whole-body coordination between a robot's base and arms. Classical whole-body controllers (WBCs) can solve such problems via hierarchical optimization, but require extensive hand-tuned optimization and remain brittle. Learning-based methods, on the other hand, show strong generalization capabilities but typically rely on expensive whole-body teleoperation data or heavy reward engineering. We observe that even a sub-optimal WBC is a powerful structural prior: it can be used to collect data in a constrained, task-relevant region of the state-action space, and its behavior can still be improved upon using offline reinforcement learning. Building on this, we propose WHOLE-MoMa, a two-stage pipeline that first generates diverse demonstrations by randomizing a lightweight…
