Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation
Jean-Pierre Sleiman, Mayank Mittal, Marco Hutter

TL;DR
This paper presents a systematic RL approach for multi-contact loco-manipulation that uses a task-independent MDP and a single demonstration per task, yielding robust policies that learn recovery maneuvers and transfer successfully to real robots.
Contribution
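It introduces a task-independent MDP for training RL policies from a single demonstration per task, generated by a model-based trajectory optimizer, together with an adaptive phase dynamics formulation that tracks the demonstration robustly under dynamic uncertainties and external disturbances. The learned policies outperform prior motion-imitation RL methods.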
Findings
Higher success rates than prior motion-imitation RL methods across all considered tasks
Policies learn recovery maneuvers not in demonstrations
Successful transfer to real robots
Abstract
Reinforcement learning (RL) often necessitates a meticulous Markov Decision Process (MDP) design tailored to each task. This work aims to address this challenge by proposing a systematic approach to behavior synthesis and control for multi-contact loco-manipulation tasks, such as navigating spring-loaded doors and manipulating heavy dishwashers. We define a task-independent MDP to train RL policies using only a single demonstration per task generated from a model-based trajectory optimizer. Our approach incorporates an adaptive phase dynamics formulation to robustly track the demonstrations while accommodating dynamic uncertainties and external disturbances. We compare our method against prior motion imitation RL works and show that the learned policies achieve higher success rates across all considered tasks. These policies learn recovery maneuvers that are not present in the…
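The abstract does not spell out the adaptive phase dynamics, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a normalized phase variable that indexes the demonstration and advances more slowly when the tracking error is large, so the reference "waits" for the robot after a disturbance. The exponential slowdown rule and every name here (`phase_rate`, `advance_phase`, `reference_at`, `sensitivity`) are hypothetical.

```python
import math


def phase_rate(tracking_error, nominal_rate=1.0, sensitivity=5.0):
    """Hypothetical rule: the phase rate decays as tracking error grows."""
    return nominal_rate * math.exp(-sensitivity * tracking_error)


def advance_phase(phase, tracking_error, dt, duration):
    """Advance a normalized phase in [0, 1] by one control step.

    With zero error the phase moves at the nominal speed dt / duration;
    with a large error it nearly stalls. The result is clamped at 1.0.
    """
    rate = phase_rate(tracking_error)
    return min(1.0, phase + rate * dt / duration)


def reference_at(demo, phase):
    """Linearly interpolate a 1-D demonstration (list of floats) at a phase."""
    idx = phase * (len(demo) - 1)
    lo = int(math.floor(idx))
    hi = min(lo + 1, len(demo) - 1)
    w = idx - lo
    return (1.0 - w) * demo[lo] + w * demo[hi]


if __name__ == "__main__":
    demo = [0.0, 1.0, 2.0]  # toy demonstration of a single joint target
    phase = 0.0
    for error in [0.0, 0.0, 1.0, 1.0, 0.0]:  # a disturbance hits mid-rollout
        phase = advance_phase(phase, error, dt=0.1, duration=1.0)
        print(round(phase, 4), round(reference_at(demo, phase), 4))
```

In this toy rollout the phase advances normally for the first two steps, barely moves while the error is large, then resumes. In the paper the policy is trained with RL; this sketch covers only the reference-indexing side under the stated assumptions.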
Peer Reviews
Decision·CoRL 2024
Taxonomy
Topics: Robot Manipulation and Learning · Teleoperation and Haptic Systems · Robotic Mechanisms and Dynamics
