Increasing the Value of Information During Planning in Uncertain Environments

Gaurab Pokharel

arXiv:2409.13754·cs.AI·September 24, 2024

Open Access

TL;DR

This paper introduces a new algorithm that enhances online planning in POMDPs by incorporating entropy into the UCB1 heuristic, improving decision-making when there are delays between information gathering and use.

Contribution

It proposes a novel modification to the POMCP algorithm by adding entropy to better value information-gathering actions during planning.
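The idea of adding an entropy term to UCB1 can be illustrated with a minimal sketch. This page does not give the paper's exact formula, so everything below is an assumption made for illustration: the bonus is taken to be the expected reduction in belief entropy, and the names `belief_entropy`, `select_action`, `posterior_entropy`, `c`, and `beta` are hypothetical rather than taken from the paper or from any POMCP library.

```python
import math
from collections import Counter


def belief_entropy(particles):
    """Shannon entropy (in nats) of the empirical state distribution
    represented by a list of sampled states (a particle belief)."""
    if not particles:
        return 0.0
    total = len(particles)
    counts = Counter(particles)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def select_action(stats, prior_entropy, c=1.0, beta=1.0):
    """Pick an action by UCB1 plus an entropy-reduction bonus.

    stats maps each action to a dict with (hypothetical) keys:
      'n'                 - visit count of the action node
      'q'                 - mean return estimate of the action node
      'posterior_entropy' - estimated belief entropy after taking the action
    beta weights the information term; beta=0 recovers plain UCB1.
    This scoring rule is an illustrative assumption, not the paper's formula.
    """
    total_visits = sum(s["n"] for s in stats.values())
    best_action, best_score = None, -math.inf
    for action, s in stats.items():
        if s["n"] == 0:
            return action  # UCB1 convention: try each untried action first
        exploration = c * math.sqrt(math.log(total_visits) / s["n"])
        info_bonus = beta * (prior_entropy - s["posterior_entropy"])
        score = s["q"] + exploration + info_bonus
        if score > best_score:
            best_action, best_score = action, score
    return best_action


# Usage: two equally visited actions; "listen" has a slightly lower value
# estimate but is expected to reduce belief uncertainty much more.
prior = belief_entropy(["left", "right"])  # uniform over two states
stats = {
    "move": {"n": 10, "q": 1.0, "posterior_entropy": 0.69},
    "listen": {"n": 10, "q": 0.9, "posterior_entropy": 0.10},
}
chosen_with_entropy = select_action(stats, prior, beta=1.0)
chosen_plain_ucb1 = select_action(stats, prior, beta=0.0)
```

In this toy setup the entropy term lets the information-gathering action win even though its immediate value estimate is lower, which is the effect the contribution describes. How the paper actually weights or computes the entropy term may differ.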

Findings

1. The new algorithm outperforms standard POMCP in the hallway problem.

2. Incorporating entropy improves the recognition of valuable information-gathering actions.

3. Results show significant performance gains in delayed-information scenarios.

Abstract

Prior studies have demonstrated that for many real-world problems, POMDPs can be solved by online algorithms both quickly and near-optimally. However, on an important set of problems where there is a large time delay between when the agent can gather information and when it needs to use that information, these solutions fail to adequately consider the value of information. As a result, information-gathering actions, even when they are critical to the optimal policy, are ignored by existing solutions, leading to sub-optimal decisions by the agent. In this research, we develop a novel solution that rectifies this problem by introducing a new algorithm that improves upon state-of-the-art online planning by better reflecting the value of actions that gather information. We do this by adding entropy to the UCB1 heuristic in the POMCP algorithm. We test this solution on the…

Taxonomy

Topics: Complex Systems and Decision Making

Methods: Sparse Evolutionary Training