Composable Deep Reinforcement Learning for Robotic Manipulation

Tuomas Haarnoja; Vitchyr Pong; Aurick Zhou; Murtaza Dalal; Pieter Abbeel; Sergey Levine

arXiv:1803.06773·cs.LG·March 20, 2018

TL;DR

This paper applies soft Q-learning to real-world robotic manipulation. It emphasizes two properties of the method: it learns expressive, multimodal policies, and it allows existing skills to be composed into new ones. The authors report better sample efficiency than prior model-free deep reinforcement learning methods.

Contribution

The paper demonstrates that soft Q-learning can be used effectively for real-world robotic tasks, highlighting its compositional capabilities and its improved sample efficiency over previous model-free approaches.

Findings

01 Soft Q-learning learns expressive, multimodal policies.

02 Learned policies can be composed to create new skills.

03 The method shows superior sample efficiency in real-world tasks.

Abstract

Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained using soft Q-learning can be applied to real-world robotic manipulation. The application of this method to real-world manipulation is facilitated by two important features of soft Q-learning. First, soft Q-learning can learn multimodal exploration strategies by learning policies represented by expressive energy-based models. Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the…
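
The two features named in the abstract can be illustrated with a minimal sketch. It assumes a small discrete action set (the paper works with continuous robot actions), a hypothetical temperature `alpha`, and made-up Q-values. The composition rule used here, averaging the constituent Q-functions, is an assumption of this sketch and is not spelled out in the truncated abstract above.

```python
import math

def soft_value(q_values, alpha=1.0):
    """Soft value V = alpha * log sum_a exp(Q(a) / alpha), computed stably."""
    m = max(q_values)
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))

def boltzmann_policy(q_values, alpha=1.0):
    """Energy-based policy with pi(a) proportional to exp(Q(a) / alpha)."""
    v = soft_value(q_values, alpha)
    return [math.exp((q - v) / alpha) for q in q_values]

def compose(q_a, q_b):
    """Average two Q-functions action by action (assumed composition rule)."""
    return [(a + b) / 2.0 for a, b in zip(q_a, q_b)]

# Hypothetical Q-values for one state over five discrete actions.
q_task_1 = [2.0, 0.0, 2.0, 0.0, -1.0]  # two equally good modes: actions 0 and 2
q_task_2 = [-1.0, 0.0, 2.0, 0.0, 2.0]  # two equally good modes: actions 2 and 4

pi_1 = boltzmann_policy(q_task_1)
pi_combined = boltzmann_policy(compose(q_task_1, q_task_2))
best_combined = max(range(len(pi_combined)), key=lambda a: pi_combined[a])
```

In this toy setting `pi_1` keeps equal probability on both of its modes rather than collapsing to one, and the composed policy concentrates on the action that both constituent tasks rate highly.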

Code & Models

Repositories

haarnoja/softqlearning (Official)

Taxonomy

Methods: Q-Learning