Closing the Learning-Planning Loop with Predictive State Representations

Byron Boots; Sajid M. Siddiqi; Geoffrey J. Gordon

arXiv:0912.2385 · cs.LG · December 15, 2009

TL;DR

This paper introduces a spectral algorithm for learning Predictive State Representations from action-observation sequences, enabling effective planning in partially observable environments, demonstrated on a vision-based robot task.

Contribution

The paper presents a novel, efficient, and statistically consistent spectral method for learning the parameters of a PSR directly from action-observation data, so that planning in the learned model can recover a near-optimal policy.

Findings

01. The algorithm accurately learns a model of the environment from sequences of action-observation pairs.

02. Learned models capture essential environment features efficiently.

03. Approximate point-based planning in the learned PSR yields successful task performance.

Abstract

A central problem in artificial intelligence is that of planning to maximize future reward under uncertainty in a partially observable environment. In this paper we propose and demonstrate a novel algorithm which accurately learns a model of such an environment directly from sequences of action-observation pairs. We then close the loop from observations to actions by planning in the learned model and recovering a policy which is near-optimal in the original environment. Specifically, we present an efficient and statistically consistent spectral algorithm for learning the parameters of a Predictive State Representation (PSR). We demonstrate the algorithm by learning a model of a simulated high-dimensional, vision-based mobile robot planning task, and then perform approximate point-based planning in the learned PSR. Analysis of our results shows that the algorithm learns a state space…
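The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea behind spectral learning, not the authors' method. It assumes a small, made-up hidden Markov model with no actions (the names `T`, `O`, `pi`, and `P21` and all the numbers are hypothetical) and shows that the matrix of joint probabilities of two consecutive observations has rank equal to the number of hidden states. A singular value decomposition of such observable statistics is what lets spectral methods find a low-dimensional state space without iterative likelihood optimization. The PSR setting in the paper additionally involves actions, which this sketch omits.

```python
import numpy as np

# Hypothetical 2-state, 3-observation HMM (illustrative numbers, not from the paper).
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])   # T[i, j] = Pr(next state i | current state j)
O = np.array([[0.7, 0.1],
              [0.2, 0.3],
              [0.1, 0.6]])   # O[x, j] = Pr(observation x | state j)
pi = np.array([0.5, 0.5])    # initial state distribution

# Exact joint distribution of two consecutive observations:
# P21[i, j] = Pr(x2 = i, x1 = j) = (O T diag(pi) O^T)[i, j]
P21 = O @ T @ np.diag(pi) @ O.T

# Singular values of the observable matrix; the number of non-negligible
# ones reveals the latent dimension (here, the number of hidden states).
singular_values = np.linalg.svd(P21, compute_uv=False)
rank = int(np.sum(singular_values > 1e-10))

print("singular values:", singular_values)
print("estimated latent dimension:", rank)
```

In practice `P21` would be estimated from counts in sampled sequences rather than computed exactly, and the rank threshold would need to account for sampling noise.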

Taxonomy

Topics: Machine Learning and Algorithms · Data Stream Mining Techniques · Bayesian Modeling and Causal Inference