Reducing Planning Complexity of General Reinforcement Learning with Non-Markovian Abstractions

Sultan J. Majeed; Marcus Hutter

arXiv:2112.13386·cs.AI·December 28, 2021

Open Access

TL;DR

This paper introduces a new non-Markovian abstraction for general reinforcement learning that significantly reduces the complexity of planning by providing tighter bounds on the number of states needed for effective surrogate MDPs.

Contribution

The paper proposes a novel non-MDP abstraction that improves upon the existing ESA framework by offering much tighter upper bounds on the state complexity for planning in GRL.

Findings

01. New non-MDP abstraction with improved upper bounds

02. Bound reduced from exponential to near-logarithmic in the number of actions

03. Action-sequentialization further tightens the bound

Abstract

The field of General Reinforcement Learning (GRL) formulates the problem of sequential decision-making from the ground up. The history of interaction constitutes a "ground" state of the system, which never repeats. On the one hand, this generality allows GRL to model almost every possible domain, e.g., bandits, MDPs, POMDPs, PSRs, and history-based environments. On the other hand, near-optimal policies in GRL are, in general, functions of the complete history, which hinders not only learning but also planning. The usual workaround for planning is to give the agent a Markovian abstraction of the underlying process, so that it can use any MDP planning algorithm to find a near-optimal policy. The Extreme State Aggregation (ESA) framework has extended this idea to non-Markovian abstractions without compromising the possibility of planning through a (surrogate) MDP. A…
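To make the idea of planning through a surrogate MDP concrete, here is a minimal sketch in Python. It is not the paper's abstraction or the ESA construction: the abstraction map `phi`, the two-state surrogate MDP (`P`, `R`), and the discount `gamma = 0.9` are all hypothetical, chosen only to show how a policy computed on a small set of abstract states can be lifted back to a policy on full histories.

```python
from typing import List, Tuple

# Hypothetical abstraction: a history is a tuple of integer observations,
# and phi maps it to the last observation (0 for the empty history).
History = Tuple[int, ...]

def phi(history: History) -> int:
    return history[-1] if history else 0

# Hypothetical surrogate MDP over two abstract states and two actions.
# P[s][a] is a list of (next_state, probability); R[s][a] is the expected reward.
P = [
    [[(0, 1.0)], [(1, 1.0)]],  # state 0: action 0 stays, action 1 moves to state 1
    [[(1, 1.0)], [(0, 1.0)]],  # state 1: action 0 stays, action 1 moves to state 0
]
R = [
    [0.0, 0.0],  # no reward in state 0
    [1.0, 0.0],  # staying in state 1 yields reward 1
]

def q_value(V: List[float], s: int, a: int, gamma: float) -> float:
    # One-step lookahead value of taking action a in abstract state s.
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])

def value_iteration(gamma: float = 0.9, tol: float = 1e-8):
    # Standard tabular value iteration with in-place updates.
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            best = max(q_value(V, s, a, gamma) for a in range(len(P[s])))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = [
        max(range(len(P[s])), key=lambda a, s=s: q_value(V, s, a, gamma))
        for s in range(len(P))
    ]
    return V, policy

V, policy = value_iteration()

def history_policy(history: History) -> int:
    # Lift the abstract policy to histories: act on phi(history) only.
    return policy[phi(history)]
```

The planner only ever touches the abstract states, so its cost depends on the size of the surrogate MDP rather than on the (never-repeating) histories; how many abstract states are needed for near-optimality is the question the bounds in this paper address.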

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Formal Methods in Verification