Residual Policy Learning

Tom Silver; Kelsey Allen; Josh Tenenbaum; Leslie Kaelbling

arXiv:1812.06298 · cs.RO · January 4, 2019 · 52 cites

TL;DR

Residual Policy Learning (RPL) improves existing controllers on robotic manipulation tasks by learning a residual on top of them with deep reinforcement learning. It yields substantial gains on complex simulated tasks where reinforcement learning from scratch is data-inefficient or intractable.

Contribution

The paper introduces RPL, a simple method for improving nondifferentiable policies with model-free deep RL. It is effective in complex robotic manipulation tasks where a good but imperfect initial controller is available.

Findings

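01 RPL improves performance across six challenging MuJoCo tasks involving partial observability, sensor noise, model misspecification, and controller miscalibration.

02 RPL can perform long-horizon, sparse-reward tasks on which reinforcement learning alone fails.

03 RPL consistently improves on the initial controllers, combining the strengths of RL and control algorithms.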

Abstract

We present Residual Policy Learning (RPL): a simple method for improving nondifferentiable policies using model-free deep reinforcement learning. RPL thrives in complex robotic manipulation tasks where good but imperfect controllers are available. In these tasks, reinforcement learning from scratch remains data-inefficient or intractable, but learning a residual on top of the initial controller can yield substantial improvements. We study RPL in six challenging MuJoCo tasks involving partial observability, sensor noise, model misspecification, and controller miscalibration. For initial controllers, we consider both hand-designed policies and model-predictive controllers with known or learned transition models. By combining learning with control algorithms, RPL can perform long-horizon, sparse-reward tasks for which reinforcement learning alone fails. Moreover, we find that RPL…
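To make the idea of "learning a residual on top of the initial controller" concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The controller, the network sizes, the observation layout, and the action bounds are all hypothetical, and the zero initialization of the output layer is an assumption made here so that the combined policy starts out identical to the initial controller. The controller is only ever called, never differentiated, which is why it may be nondifferentiable. The model-free RL update that would train the residual weights is deliberately left out.

```python
import numpy as np


def initial_controller(obs):
    """Hypothetical hand-designed controller: proportional move toward a goal.

    Assumes obs = [x, y, goal_x, goal_y]; this layout is illustrative only.
    """
    pos, goal = obs[:2], obs[2:]
    return np.clip(2.0 * (goal - pos), -1.0, 1.0)


class ResidualPolicy:
    """Action = initial controller output + learned residual (a sketch)."""

    def __init__(self, controller, obs_dim, act_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.controller = controller  # treated as a black box
        self.W1 = rng.normal(scale=0.1, size=(obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        # Assumption: a zero output layer makes the residual start at zero,
        # so the combined policy initially reproduces the controller.
        self.W2 = np.zeros((hidden, act_dim))
        self.b2 = np.zeros(act_dim)

    def residual(self, obs):
        """Small MLP correction; its weights are what RL would train."""
        h = np.tanh(obs @ self.W1 + self.b1)
        return h @ self.W2 + self.b2

    def act(self, obs):
        """Sum the controller action and the residual, then clip to bounds."""
        action = self.controller(obs) + self.residual(obs)
        return np.clip(action, -1.0, 1.0)


policy = ResidualPolicy(initial_controller, obs_dim=4, act_dim=2)
obs = np.array([0.0, 0.0, 0.2, -0.1])
before = policy.act(obs)  # equals the controller's action at initialization

# Stand-in for a learned update: nudge the output bias by hand.
policy.b2 = np.array([0.1, 0.0])
after = policy.act(obs)  # controller action shifted by the residual
```

In this sketch only `W1`, `b1`, `W2`, and `b2` would be updated by the RL algorithm; the controller itself stays fixed throughout.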

Code & Models

Repositories

k-r-allen/residual-policy-learning
tf

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning