Path Integral Networks: End-to-End Differentiable Optimal Control
Masashi Okada, Luca Rigazio, Takenobu Aoshima

TL;DR
Path Integral Networks (PI-Net) is a fully differentiable recurrent network that models optimal control planning, capable of learning dynamics and costs end-to-end, and generalizing to unseen states in continuous control tasks.
Contribution
This paper introduces PI-Net, a novel neural network architecture that integrates the Path Integral optimal control algorithm into a differentiable framework for end-to-end learning.
Findings
PI-Net, trained by imitation learning, can mimic control demonstrations on two simulated problems: a linear system and a pendulum swing-up task.
PI-Net is capable of learning latent dynamics and cost models from demonstrations.
Preliminary results show effective planning and generalization in continuous control tasks.
Abstract
In this paper, we introduce Path Integral Networks (PI-Net), a recurrent network representation of the Path Integral optimal control algorithm. The network includes both system dynamics and cost models, used for optimal control based planning. PI-Net is fully differentiable, learning both dynamics and cost models end-to-end by back-propagation and stochastic gradient descent. Because of this, PI-Net can learn to plan. PI-Net has several advantages: it can generalize to unseen states thanks to planning, it can be applied to continuous control tasks, and it allows for a wide variety of learning schemes, including imitation and reinforcement learning. Preliminary experimental results show that PI-Net, trained by imitation learning, can mimic control demonstrations for two simulated problems: a linear system and a pendulum swing-up problem. We also show that PI-Net is able to learn dynamics and…
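To make the planning step concrete, here is a minimal NumPy sketch of one generic path-integral control update: sample perturbed control sequences, roll each through a dynamics model, score it with a cost model, and shift the nominal controls by a softmin-weighted average of the perturbations. It is not taken from the paper. The function names (`rollout_cost`, `path_integral_update`), the toy linear dynamics, the quadratic cost, and the parameters `sigma` and `lam` are all assumptions chosen for illustration, and the sketch covers only the forward computation. Every operation in it is smooth (sums, products, `exp`), which is the property that would let such an update be unrolled as a recurrent network and trained by back-propagation.

```python
import numpy as np

def rollout_cost(u, x0, dynamics, cost):
    """Accumulated cost of applying control sequence u (shape T x m) from state x0."""
    x, total = x0, 0.0
    for u_t in u:
        x = dynamics(x, u_t)      # hypothetical one-step dynamics model
        total += cost(x, u_t)     # hypothetical running-cost model
    return total

def path_integral_update(u, x0, dynamics, cost,
                         n_samples=64, sigma=0.5, lam=1.0, rng=None):
    """One generic path-integral update of a nominal control sequence u.

    sigma (noise scale) and lam (temperature) are illustrative defaults,
    not values from the paper.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    T, m = u.shape
    # Sample K perturbations of the whole control sequence.
    eps = rng.normal(0.0, sigma, size=(n_samples, T, m))
    costs = np.array([rollout_cost(u + eps[k], x0, dynamics, cost)
                      for k in range(n_samples)])
    # Softmin weights; subtracting the minimum keeps exp() numerically stable.
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    # Shift the nominal controls by the cost-weighted average perturbation.
    return u + np.einsum("k,ktm->tm", w, eps)

# Toy usage (assumed problem): drive a scalar linear system toward the origin.
dynamics = lambda x, u: x + 0.1 * u
cost = lambda x, u: float(x @ x + 0.01 * (u @ u))
u = np.zeros((10, 1))
x0 = np.array([2.0])
for _ in range(5):
    u = path_integral_update(u, x0, dynamics, cost, n_samples=256)
```

In a learned setting, `dynamics` and `cost` would be parameterized models rather than fixed lambdas, and the update would be written in an automatic-differentiation framework so that gradients flow through the rollouts and weights.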
Taxonomy
Topics: Neural Networks Stability and Synchronization · Reinforcement Learning in Robotics · Advanced Control Systems Optimization
