Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction

Dilip Arumugam; Satinder Singh

arXiv:2210.16872 · cs.LG · November 1, 2022 · 1 citation

TL;DR

This paper introduces a complexity measure for BAMDP planning that captures the difficulty of information gathering, and proposes an abstraction-based approach to enable more efficient approximate planning in Bayesian reinforcement learning.

Contribution

The paper defines a new complexity measure for BAMDP planning and develops a state abstraction that lowers this complexity, making approximate planning more tractable.

Findings

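01 The complexity measure captures the worst-case difficulty of gathering information and exhausting epistemic uncertainty.

02 An exact planning algorithm, itself computationally intractable, uses the measure to plan more efficiently, which illustrates the measure's significance.

03 A state abstraction reduces the complexity of the BAMDP, enabling approximate planning.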

Abstract

The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the Bayes-optimal solution to the exploration-exploitation trade-off in reinforcement learning. As the computation of exact solutions to Bayesian reinforcement-learning problems is intractable, much of the literature has focused on developing suitable approximation algorithms. In this work, before diving into algorithm design, we first define, under mild structural assumptions, a complexity measure for BAMDP planning. As efficient exploration in BAMDPs hinges upon the judicious acquisition of information, our complexity measure highlights the worst-case difficulty of gathering information and exhausting epistemic uncertainty. To illustrate its significance, we establish a computationally-intractable, exact planning algorithm that takes advantage of this measure to show more efficient planning. We then conclude by…
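
As a rough illustration of what "exhausting epistemic uncertainty" can mean, the sketch below assumes the unknown transition function is one of a finite set of candidate models. That assumption is made here for illustration and is not taken from this page. The names `update_belief`, `uncertainty_exhausted`, `model_a`, and `model_b` are hypothetical, and this is not the paper's algorithm: it only shows a Bayes-rule update of the agent's belief (its epistemic state) after one observed transition.

```python
from typing import Dict, List, Tuple

State = int
Action = int
# A candidate transition model: (state, action) -> {next_state: probability}.
Model = Dict[Tuple[State, Action], Dict[State, float]]


def update_belief(belief: List[float], models: List[Model],
                  s: State, a: Action, s_next: State) -> List[float]:
    """Bayes-rule update of a belief over a finite set of candidate models."""
    # Weight each candidate by how likely it makes the observed transition.
    weights = [b * m[(s, a)].get(s_next, 0.0) for b, m in zip(belief, models)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observed transition is impossible under every candidate")
    return [w / total for w in weights]


def uncertainty_exhausted(belief: List[float], tol: float = 1e-12) -> bool:
    """True when all posterior mass sits on a single candidate model."""
    return max(belief) >= 1.0 - tol


# Two hypothetical candidates that differ only in what action 1 does in state 0.
model_a: Model = {(0, 0): {0: 1.0}, (0, 1): {1: 1.0},
                  (1, 0): {1: 1.0}, (1, 1): {1: 1.0}}
model_b: Model = {(0, 0): {0: 1.0}, (0, 1): {0: 1.0},
                  (1, 0): {1: 1.0}, (1, 1): {1: 1.0}}
models = [model_a, model_b]

belief = [0.5, 0.5]  # uniform prior over the two candidates
# Action 0 behaves identically under both candidates, so it reveals nothing.
belief_uninformative = update_belief(belief, models, 0, 0, 0)
# Action 1 separates the candidates, so one observation identifies the model.
belief_informative = update_belief(belief, models, 0, 1, 1)
```

In this toy setting the uninformative action leaves the belief at `[0.5, 0.5]`, while the informative one collapses it onto the first candidate, after which there is no information left to gather.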

Videos

Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Adversarial Robustness in Machine Learning