RecoveryChaining: Learning Local Recovery Policies for Robust Manipulation
Shivam Vats, Devesh K. Jha, Maxim Likhachev, Oliver Kroemer, Diego Romeres

TL;DR
RecoveryChaining is a hierarchical reinforcement learning method that enables robots to learn robust recovery policies for manipulation tasks, improving resilience to failures and enabling sim-to-real transfer.
Contribution
It introduces a hybrid action space with nominal controllers as options, allowing effective learning of recovery policies even with sparse rewards.
Findings
Learned recovery policies outperform baselines in manipulation tasks.
Successfully transferred policies from simulation to a real robot.
Demonstrated robustness to actuation noise and partial observability.
Abstract
Model-based planners and controllers are commonly used to solve complex manipulation problems as they can efficiently optimize diverse objectives and generalize to long-horizon tasks. However, they often fail during deployment due to noisy actuation, partial observability and imperfect models. To enable a robot to recover from such failures, we propose to use hierarchical reinforcement learning to learn a recovery policy. The recovery policy is triggered when a failure is detected based on sensory observations and seeks to take the robot to a state from which it can complete the task using the nominal model-based controllers. Our approach, called RecoveryChaining, uses a hybrid action space, where the model-based controllers are provided as additional nominal options, which allows the recovery policy to decide how to recover, when to switch to a nominal controller and which…
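The hybrid action space described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D integer state, the `NominalOption` and `HybridActionSpace` classes, the `step` and `rollout` functions, and the hand-written recovery rule are all hypothetical. It also assumes that choosing a nominal option ends the episode with a sparse reward equal to the task outcome, which the summary above suggests but does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = int  # hypothetical toy state: position offset from the goal, in grid cells


@dataclass
class NominalOption:
    """A model-based controller exposed to the policy as a single action."""
    name: str
    run: Callable[[State], bool]  # returns True if the task succeeds from this state


class HybridActionSpace:
    """Primitive actions first, followed by the nominal options."""

    def __init__(self, primitives: List[int], options: List[NominalOption]):
        self.primitives = primitives
        self.options = options

    def __len__(self) -> int:
        return len(self.primitives) + len(self.options)

    def is_option(self, action: int) -> bool:
        return action >= len(self.primitives)


def step(state: State, action: int, space: HybridActionSpace) -> Tuple[State, float, bool]:
    """Apply one action and return (next_state, reward, done)."""
    if space.is_option(action):
        # Assumption: handing control to a nominal controller ends the episode,
        # and the only reward is whether the task then succeeds.
        option = space.options[action - len(space.primitives)]
        return state, (1.0 if option.run(state) else 0.0), True
    return state + space.primitives[action], 0.0, False


# Toy nominal controller: succeeds only when started within one cell of the goal.
insert = NominalOption("insert", lambda s: abs(s) <= 1)
space = HybridActionSpace(primitives=[-1, +1], options=[insert])


def rollout(start: State, max_steps: int = 20) -> Tuple[float, int]:
    """Hand-written stand-in for a learned recovery policy."""
    state, steps = start, 0
    while steps < max_steps:
        if abs(state) <= 1:
            action = len(space.primitives)  # switch to the nominal option
        else:
            action = 0 if state > 0 else 1  # move one cell toward the goal
        state, reward, done = step(state, action, space)
        steps += 1
        if done:
            return reward, steps
    return 0.0, steps
```

In this sketch, calling the nominal option directly from a failure state such as `3` yields zero reward, while moving two cells toward the goal first lets the same controller succeed. In the method described above, a reinforcement learning agent would learn that switching decision rather than follow a fixed rule.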
Taxonomy
Topics: Robot Manipulation and Learning
