Universal Planning Networks
Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn

TL;DR
Universal Planning Networks (UPN) integrate differentiable planning into goal-directed policies, enabling effective visuomotor control, goal specification, and transfer across different robots through learned representations optimized via imitation learning.
Contribution
Introduction of UPN, a novel framework embedding differentiable planning within policies, improving goal generalization and transfer in visuomotor tasks.
Findings
Effective goal-directed visual imitation via gradient-based planning
Representations enable distance-based rewards for reinforcement learning
Successful transfer of planning strategies across different robot morphologies
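As an illustration of the second finding, a distance-based reward can be sketched as the negative distance between the encoded current observation and the encoded goal image. This is only a minimal sketch: `encode`, the linear map `W`, and the choice of Euclidean distance are assumptions made for the example, standing in for the learned encoder and whatever metric the paper actually uses.

```python
import numpy as np

def encode(obs, W):
    """Hypothetical stand-in for a learned encoder: a fixed linear map."""
    return W @ obs

def latent_distance_reward(obs, goal_obs, W):
    """Reward is higher (closer to zero) when the latent codes are closer.

    Assumption: Euclidean distance in latent space; the summary above does
    not state which distance is used.
    """
    return -float(np.linalg.norm(encode(obs, W) - encode(goal_obs, W)))

# Toy usage with made-up observations and an arbitrary encoder matrix.
W = np.array([[1.0, 0.0, 0.5],
              [0.0, 2.0, 0.0]])
goal = np.array([1.0, 1.0, 0.0])
near = np.array([0.9, 1.0, 0.0])
far = np.array([-1.0, 0.0, 0.0])

r_goal = latent_distance_reward(goal, goal, W)
r_near = latent_distance_reward(near, goal, W)
r_far = latent_distance_reward(far, goal, W)
```

A reinforcement learning agent would then maximize this reward to reach the goal image, without a hand-designed reward function.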
Abstract
A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization. To this end, we introduce universal planning networks (UPN). UPNs embed differentiable planning within a goal-directed policy. This planning computation unrolls a forward model in a latent space and infers an optimal action plan through gradient descent trajectory optimization. The plan-by-gradient-descent process and its underlying representations are learned end-to-end to directly optimize a supervised imitation learning objective. We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images. The learned representations can be leveraged to specify distance-based rewards to reach new target…
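The inner planning loop described in the abstract can be sketched as follows. This is a minimal sketch under strong assumptions, not the authors' implementation: the latent forward model is taken to be linear (`z_next = A @ z + B @ a`) so that gradients can be written by hand, the plan loss is a squared distance to the goal latent, and all names (`rollout`, `plan_by_gradient_descent`, `A`, `B`) are hypothetical. In UPN the encoder and forward model are learned networks, and an outer imitation objective trains them through this loop; that outer loop is omitted here.

```python
import numpy as np

def rollout(A, B, z0, actions):
    """Unroll the (assumed linear) latent forward model over the plan."""
    z = z0
    for a in actions:
        z = A @ z + B @ a
    return z

def plan_by_gradient_descent(A, B, z0, z_goal, horizon=5, steps=200, lr=0.05):
    """Optimize an action sequence so the final latent state reaches z_goal."""
    actions = np.zeros((horizon, B.shape[1]))
    for _ in range(steps):
        z_final = rollout(A, B, z0, actions)
        # Gradient of 0.5 * ||z_final - z_goal||^2 with respect to z_final.
        grad_z = z_final - z_goal
        grads = np.zeros_like(actions)
        # Backpropagate through the unrolled dynamics, last step first.
        for t in reversed(range(horizon)):
            grads[t] = B.T @ grad_z
            grad_z = A.T @ grad_z
        actions -= lr * grads
    return actions

# Toy usage: 2-D latent space, identity dynamics, start at the origin.
A = np.eye(2)
B = np.eye(2)
z0 = np.zeros(2)
z_goal = np.array([1.0, -1.0])

plan = plan_by_gradient_descent(A, B, z0, z_goal)
z_reached = rollout(A, B, z0, plan)
```

With a learned, nonlinear forward model the hand-written backward pass would be replaced by automatic differentiation, but the structure stays the same: unroll, measure the distance to the goal in latent space, and update the actions by gradient descent.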
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms
