Adaptive importance sampling for control and inference

Hilbert Johan Kappen; Hans Christian Ruiz

arXiv:1505.01874·cs.SY·March 23, 2016

TL;DR

This paper introduces the Path Integral Cross Entropy (PICE) method for learning and representing state-feedback controllers in non-linear stochastic control problems, enabling efficient control computation and improved inference in latent state models.

Contribution

It develops a gradient descent approach to learn feedback controllers using the cross entropy method within the path integral control framework.

Findings

01. PICE effectively learns controllers for non-linear stochastic control problems.

02. The method provides an accurate alternative to particle filtering in neural data inference.

03. The method is demonstrated successfully on simple control examples.

Abstract

Path integral (PI) control problems are a restricted class of non-linear control problems that can be solved formally as a Feynman-Kac path integral and can be estimated using Monte Carlo sampling. In this contribution we review path integral control theory in the finite horizon case. We subsequently focus on the problem of how to compute and represent control solutions. Within the PI theory, the question of how to compute becomes the question of importance sampling. Efficient importance samplers are state feedback controllers, and using them requires an efficient representation. Learning and representing effective state-feedback controllers for non-linear stochastic control problems is a very challenging, and largely unsolved, problem. We show how to learn and represent such controllers using ideas from the cross entropy method. We derive a gradient descent method that allows to…
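
The link between importance sampling and state-feedback control can be illustrated on a toy problem. The sketch below is not the PICE method itself (nothing is learned); it only compares Monte Carlo estimates of the optimal cost-to-go obtained with uncontrolled sampling and with a feedback controller used as importance sampler. Everything specific is an assumption made for illustration: a one-dimensional state with dynamics dx = u dt + dW, unit noise variance and temperature, no state cost, a quadratic end cost alpha * x_T^2 / 2, and the hypothetical helper names `sample_path_costs` and `cost_to_go_and_ess`. The feedback law used as the good sampler is the analytic optimum of this particular toy problem.

```python
import numpy as np

def sample_path_costs(controller, x0, alpha, T=1.0, n_steps=100,
                      n_paths=5000, seed=0):
    """Simulate dx = u dt + dW and return the importance-corrected path cost
    S_u = alpha * x_T^2 / 2 + sum(u^2 dt / 2 + u dW) for each path."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0, dtype=float)
    s = np.zeros(n_paths)
    for k in range(n_steps):
        u = controller(x, k * dt)
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # Girsanov correction for sampling under control u (unit noise)
        s += 0.5 * u**2 * dt + u * dw
        x += u * dt + dw
    s += 0.5 * alpha * x**2  # quadratic end cost (toy assumption)
    return s

def cost_to_go_and_ess(s):
    """Estimate J = -log E[exp(-S_u)] and the effective sample size fraction."""
    s_min = s.min()
    w = np.exp(-(s - s_min))            # shift by s_min for numerical stability
    j = s_min - np.log(w.mean())
    ess = w.sum()**2 / (w**2).sum() / len(w)
    return j, ess

alpha, x0, T = 5.0, 2.0, 1.0
# Closed form for this toy problem: x_T ~ N(x0, T) under uncontrolled dynamics
j_exact = 0.5 * np.log(1 + alpha * T) + alpha * x0**2 / (2 * (1 + alpha * T))

# Uncontrolled sampling (u = 0)
j_unc, ess_unc = cost_to_go_and_ess(
    sample_path_costs(lambda x, t: np.zeros_like(x), x0, alpha, T))

# Feedback controller as importance sampler (analytic optimum of the toy problem)
j_opt, ess_opt = cost_to_go_and_ess(
    sample_path_costs(lambda x, t: -alpha * x / (1 + alpha * (T - t)), x0, alpha, T))

print(f"exact J = {j_exact:.3f}")
print(f"u = 0:    J = {j_unc:.3f}, ESS fraction = {ess_unc:.3f}")
print(f"feedback: J = {j_opt:.3f}, ESS fraction = {ess_opt:.3f}")
```

Both estimators target the same quantity, but the effective sample size fraction indicates how many samples actually contribute. In this toy setting the feedback controller should keep the weights nearly uniform, which is the sense in which a good state-feedback controller acts as an efficient importance sampler.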
