Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning

Tianyu Li; Bogdan Mazoure; Doina Precup; Guillaume Rabusseau

arXiv:1911.05010·cs.AI·November 25, 2019·1 cites

TL;DR

This paper introduces a unified approach to learning and planning in partially observable environments. It uses spectral learning and unnormalized Q functions to improve sample and time efficiency while offering theoretical guarantees.

Contribution

The paper proposes a novel algorithm that integrates learning and planning. The algorithm is closely related to the spectral learning algorithm for predictive state representations and comes with theoretical guarantees and favorable time complexity.

Findings

01 More sample-efficient than classical methods

02 Faster in terms of computation time

03 Validated on two domains with improved performance

Abstract

Learning and planning in partially observable domains is one of the most difficult problems in reinforcement learning. Traditional methods consider these two problems as independent, resulting in a classical two-stage paradigm: first learn the environment dynamics and then plan accordingly. This approach, however, disconnects the two problems and can consequently lead to algorithms that are sample-inefficient and time-consuming. In this paper, we propose a novel algorithm that combines learning and planning. Our algorithm is closely related to the spectral learning algorithm for predictive state representations and offers appealing theoretical guarantees and time complexity. We empirically show on two domains that our approach is more sample- and time-efficient than classical methods.

Taxonomy

Topics: Reinforcement Learning in Robotics · Gene Regulatory Network Analysis · Receptor Mechanisms and Signaling