Path Integral Networks: End-to-End Differentiable Optimal Control

Masashi Okada; Luca Rigazio; Takenobu Aoshima

arXiv:1706.09597 · cs.AI · June 30, 2017 · 41 cites

TL;DR

Path Integral Networks (PI-Net) is a fully differentiable recurrent network that models optimal control planning, capable of learning dynamics and costs end-to-end, and generalizing to unseen states in continuous control tasks.

Contribution

This paper introduces PI-Net, a novel neural network architecture that integrates the Path Integral optimal control algorithm into a differentiable framework for end-to-end learning.

Findings

01. PI-Net, trained by imitation learning, can mimic control demonstrations in two simulated problems: a linear system and a pendulum swing-up.

02. PI-Net can learn latent dynamics and cost models from demonstrations.

03. Preliminary results show effective planning and generalization to unseen states in continuous control tasks.

Abstract

In this paper, we introduce Path Integral Networks (PI-Net), a recurrent network representation of the Path Integral optimal control algorithm. The network includes both system dynamics and cost models, used for optimal control based planning. PI-Net is fully differentiable, learning both dynamics and cost models end-to-end by back-propagation and stochastic gradient descent. Because of this, PI-Net can learn to plan. PI-Net has several advantages: it can generalize to unseen states thanks to planning, it can be applied to continuous control tasks, and it allows for a wide variety of learning schemes, including imitation and reinforcement learning. Preliminary experiment results show that PI-Net, trained by imitation learning, can mimic control demonstrations for two simulated problems: a linear system and a pendulum swing-up problem. We also show that PI-Net is able to learn dynamics and…
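To make the planning loop in the abstract concrete, the following is a minimal sketch of a generic path-integral-style control update, not the authors' implementation. It assumes a hand-written scalar linear system and quadratic cost (in PI-Net both models would be learned), and every name and parameter here (`rollout_cost`, `path_integral_update`, `plan`, `sigma`, `lam`) is hypothetical. Each iteration perturbs the control sequence with Gaussian noise, rolls the perturbed sequences through the dynamics, and moves the controls toward the low-cost samples using exponential weights.

```python
import math
import random


def rollout_cost(x0, controls, a=1.0, b=1.0, q=1.0, r=0.1):
    """Simulate x_{t+1} = a*x_t + b*u_t and sum a quadratic cost.

    The dynamics and cost coefficients are illustrative assumptions.
    """
    x, cost = x0, 0.0
    for u in controls:
        x = a * x + b * u
        cost += q * x * x + r * u * u
    return cost


def path_integral_update(x0, controls, rng, n_samples=256, sigma=0.5, lam=1.0):
    """One update: sample noisy control sequences, weight them by cost."""
    noises, costs = [], []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, sigma) for _ in controls]
        perturbed = [u + e for u, e in zip(controls, eps)]
        noises.append(eps)
        costs.append(rollout_cost(x0, perturbed))

    # Exponential weights; subtracting the minimum cost avoids underflow.
    c_min = min(costs)
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)

    # Shift each control by the weighted average of its sampled noise.
    new_controls = []
    for t, u in enumerate(controls):
        delta = sum(w * eps[t] for w, eps in zip(weights, noises)) / total
        new_controls.append(u + delta)
    return new_controls


def plan(x0, horizon=10, n_iters=20, seed=0):
    """Repeat the update starting from an all-zero control sequence."""
    rng = random.Random(seed)
    controls = [0.0] * horizon
    for _ in range(n_iters):
        controls = path_integral_update(x0, controls, rng)
    return controls


if __name__ == "__main__":
    planned = plan(2.0)
    print("cost with zero controls:", rollout_cost(2.0, [0.0] * 10))
    print("cost after planning:    ", rollout_cost(2.0, planned))
```

The update uses only sums, exponentials, and weighted averages. In an automatic-differentiation framework these operations admit gradients, which is consistent with the abstract's description of unrolling the algorithm as a recurrent network and training it by back-propagation.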

Taxonomy

Topics: Neural Networks Stability and Synchronization · Reinforcement Learning in Robotics · Advanced Control Systems Optimization