Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with Deep Reinforcement Learning

Andrew S. Morgan, Daljeet Nandha, Georgia Chalvatzaki, Carlo D'Eramo, Aaron M. Dollar, and Jan Peters

arXiv:2103.13842 · cs.RO · November 1, 2021

TL;DR

MoPAC is a hybrid deep reinforcement learning method that combines model predictive control with policy optimization, making robot skill acquisition more sample-efficient while mitigating model bias and reducing the number of physical interactions needed.

Contribution

The paper introduces MoPAC, a novel hybrid model-based/model-free reinforcement learning algorithm that enhances sample efficiency and reduces model bias for robot training.

Findings

01. MoPAC outperforms state-of-the-art methods in simulation tasks.

02. MoPAC successfully trains a physical robotic hand for complex manipulation tasks.

03. The approach reduces the number of physical interactions needed during training.

Abstract

Substantial advancements to model-based reinforcement learning algorithms have been impeded by the model bias induced by the collected data, which generally hurts performance. Meanwhile, their inherent sample efficiency warrants utility for most robot applications, limiting potential damage to the robot and its environment during training. Inspired by information theoretic model predictive control and advances in deep reinforcement learning, we introduce Model Predictive Actor-Critic (MoPAC), a hybrid model-based/model-free method that combines model predictive rollouts with policy optimization so as to mitigate model bias. MoPAC leverages optimal trajectories to guide policy learning, but explores via its model-free method, allowing the algorithm to learn more expressive dynamics models. This combination guarantees optimal skill learning up to an approximation error and reduces necessary…
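
The abstract names information theoretic model predictive control as one of its inspirations. As a rough illustration only, and not the authors' implementation, the sketch below shows one sampling-based update of that general kind: perturb a nominal action sequence, roll each candidate out through a model, and average the candidates with exponentially cost-weighted importance weights. Everything in it is an assumption made for illustration: the 1-D `toy_model` stands in for a learned dynamics model, and the names `rollout_cost`, `mppi_update`, `noise_std`, and `temperature` are hypothetical.

```python
import math
import random


def toy_model(state, action):
    """Hypothetical stand-in for a learned dynamics model (1-D integrator)."""
    return state + 0.1 * action


def rollout_cost(state, actions, goal=1.0):
    """Sum of squared distances to the goal along a model rollout."""
    cost = 0.0
    for action in actions:
        state = toy_model(state, action)
        cost += (state - goal) ** 2
    return cost


def mppi_update(state, nominal, num_samples=256, noise_std=0.5,
                temperature=1.0, rng=None):
    """One sampling-based refinement of a nominal action sequence."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    candidates, costs = [], []
    for _ in range(num_samples):
        # Perturb every action of the nominal sequence with Gaussian noise.
        seq = [u + rng.gauss(0.0, noise_std) for u in nominal]
        candidates.append(seq)
        costs.append(rollout_cost(state, seq))

    # Exponential weights; subtracting the best cost avoids underflow.
    best = min(costs)
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(weights)

    # Weighted average of the candidates, one time step at a time.
    return [
        sum(w * seq[t] for w, seq in zip(weights, candidates)) / total
        for t in range(len(nominal))
    ]


if __name__ == "__main__":
    nominal = [0.0] * 5
    refined = mppi_update(0.0, nominal)
    print("nominal cost:", rollout_cost(0.0, nominal))
    print("refined cost:", rollout_cost(0.0, refined))
```

In a hybrid scheme such as the one the abstract describes, trajectories refined this way would be used to guide policy learning, while exploration is left to the model-free learner; how MoPAC wires these together is specified in the paper and its repository, not in this sketch.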

Code & Models

Repositories

dnandha/mopac (Official)
