Closing the Learning-Planning Loop with Predictive State Representations
Byron Boots, Sajid M. Siddiqi, Geoffrey J. Gordon

TL;DR
This paper introduces a spectral algorithm for learning Predictive State Representations from action-observation sequences, enabling effective planning in partially observable environments, demonstrated on a vision-based robot task.
Contribution
It presents a novel, efficient, and statistically consistent spectral method for learning PSRs directly from data, facilitating near-optimal planning in complex environments.
Findings
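The algorithm accurately learns a model of a partially observable environment directly from sequences of action-observation pairs.
The learned state space efficiently captures the essential features of the environment.
Approximate point-based planning in the learned PSR recovers a policy that is near-optimal in the original environment.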
Abstract
A central problem in artificial intelligence is that of planning to maximize future reward under uncertainty in a partially observable environment. In this paper we propose and demonstrate a novel algorithm which accurately learns a model of such an environment directly from sequences of action-observation pairs. We then close the loop from observations to actions by planning in the learned model and recovering a policy which is near-optimal in the original environment. Specifically, we present an efficient and statistically consistent spectral algorithm for learning the parameters of a Predictive State Representation (PSR). We demonstrate the algorithm by learning a model of a simulated high-dimensional, vision-based mobile robot planning task, and then perform approximate point-based planning in the learned PSR. Analysis of our results shows that the algorithm learns a state space…
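The core idea behind a spectral learning algorithm of this kind can be illustrated in a much simpler setting. The sketch below is not the paper's PSR algorithm: it assumes an action-free hidden Markov model, uses exact low-order moments instead of estimates from sampled sequences, and all names (exact_moments, spectral_params, spectral_prob, forward_prob) and the toy parameters are hypothetical. It shows how a singular value decomposition of a matrix of joint observation probabilities yields a low-dimensional state and per-observation update operators that reproduce sequence probabilities without ever recovering the hidden states.

```python
import numpy as np

def exact_moments(T, O, pi):
    """Joint probabilities of observation singles, pairs, and triples.

    T[i, j] = P(next hidden state i | current hidden state j)
    O[x, j] = P(observation x | hidden state j)
    pi[j]   = P(initial hidden state j)
    """
    P1 = O @ pi                                  # P1[i] = P(x1 = i)
    P21 = O @ T @ np.diag(pi) @ O.T              # P21[i, j] = P(x2 = i, x1 = j)
    # P3x1[x][i, j] = P(x3 = i, x2 = x, x1 = j)
    P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T
            for x in range(O.shape[0])]
    return P1, P21, P3x1

def spectral_params(P1, P21, P3x1, k):
    """Recover a k-dimensional observable model from the moments."""
    U = np.linalg.svd(P21)[0][:, :k]             # top-k left singular vectors
    b1 = U.T @ P1                                # initial state
    binf = np.linalg.pinv(P21.T @ U) @ P1        # normalization vector
    pinv_UP = np.linalg.pinv(U.T @ P21)
    B = [U.T @ P @ pinv_UP for P in P3x1]        # one update operator per observation
    return b1, binf, B

def spectral_prob(seq, b1, binf, B):
    """P(x1, ..., xt) computed from the learned operators."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)

def forward_prob(seq, T, O, pi):
    """Reference value from the standard forward recursion on the true HMM."""
    alpha = pi
    for x in seq:
        alpha = T @ (O[x] * alpha)
    return float(alpha.sum())

# Toy HMM (assumed for illustration): 2 hidden states, 3 observations.
T = np.array([[0.7, 0.2],
              [0.3, 0.8]])
O = np.array([[0.6, 0.1],
              [0.3, 0.2],
              [0.1, 0.7]])
pi = np.array([0.5, 0.5])

b1, binf, B = spectral_params(*exact_moments(T, O, pi), k=2)

seq = [0, 2, 1, 1]
print(spectral_prob(seq, b1, binf, B), forward_prob(seq, T, O, pi))
```

With exact moments the two printed values should agree up to floating-point error. In a learning setting the moments would instead be estimated from data, and a controlled system such as a PSR would additionally index the operators by action; both are omitted here to keep the sketch short.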
Taxonomy
Topics: Machine Learning and Algorithms · Data Stream Mining Techniques · Bayesian Modeling and Causal Inference
