Universal Planning Networks

Aravind Srinivas; Allan Jabri; Pieter Abbeel; Sergey Levine; Chelsea Finn

arXiv:1804.00645·cs.LG·April 5, 2018·92 cites

TL;DR

Universal Planning Networks (UPN) integrate differentiable planning into goal-directed policies, enabling effective visuomotor control, goal specification, and transfer across different robots through learned representations optimized via imitation learning.

Contribution

Introduction of UPN, a novel framework embedding differentiable planning within policies, improving goal generalization and transfer in visuomotor tasks.

Findings

01. Effective goal-directed visual imitation via gradient-based planning

02. Representations enable distance-based rewards for reinforcement learning

03. Successful transfer of planning strategies across different robot morphologies

Abstract

A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization. To this end, we introduce universal planning networks (UPN). UPNs embed differentiable planning within a goal-directed policy. This planning computation unrolls a forward model in a latent space and infers an optimal action plan through gradient descent trajectory optimization. The plan-by-gradient-descent process and its underlying representations are learned end-to-end to directly optimize a supervised imitation learning objective. We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images. The learned representations can be leveraged to specify distance-based rewards to reach new target…
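The inner planning step described in the abstract (unroll a forward model in latent space, then improve the action plan by gradient descent) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper or its repository: the forward model is a fixed linear map z' = A z + B a rather than a learned network, the planning loss is a plain squared distance to the goal latent, the function names are hypothetical, and the gradients are written out by hand. The outer imitation-learning objective that trains the encoder and forward model is not shown.

```python
import numpy as np


def rollout(A, B, z0, actions):
    """Unroll the (assumed linear) latent forward model z' = A z + B a."""
    z = z0
    for a in actions:
        z = A @ z + B @ a
    return z


def plan_by_gradient_descent(A, B, z0, z_goal, horizon=5, steps=200, lr=0.05):
    """Hypothetical inner-loop planner.

    Starts from an all-zero action plan and repeatedly takes gradient steps
    on the squared distance between the predicted final latent and z_goal.
    """
    actions = np.zeros((horizon, B.shape[1]))
    for _ in range(steps):
        # Forward pass: predict the final latent state under the current plan.
        z_final = rollout(A, B, z0, actions)

        # Backward pass: propagate d(loss)/d(z) back through the unrolled model.
        grad_z = 2.0 * (z_final - z_goal)
        grads = np.zeros_like(actions)
        for t in reversed(range(horizon)):
            grads[t] = B.T @ grad_z   # gradient w.r.t. action a_t
            grad_z = A.T @ grad_z     # gradient w.r.t. latent z_t

        # Gradient descent update on the whole action sequence.
        actions -= lr * grads
    return actions
```

As a usage example, with A and B set to 2x2 identity matrices, a zero start latent, and a goal latent of [1.0, -2.0], the returned plan has shape (5, 2), and rolling it out with `rollout` lands on the goal latent to within numerical tolerance. In the setting the abstract describes, the latents would come from a learned encoder applied to images and the plan would be compared against expert actions; this sketch covers only the plan-by-gradient-descent loop.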

Code & Models

Repositories

aravindsrinivas/upn (tf)

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms