Rollout Heuristics for Online Stochastic Contingent Planning

Oded Blumenthal; Guy Shani

arXiv:2310.02345·cs.AI·October 5, 2023·AREA@ECAI

TL;DR

This paper models POMDPs as stochastic contingent planning problems so that domain-independent heuristics can guide the rollouts in Monte-Carlo planning, reducing the need for hand-crafted, domain-specific heuristics.

Contribution

The paper introduces two heuristics for POMCP rollouts, one built on classical planning heuristics and one that operates in belief space, to improve rollout quality in stochastic contingent planning.

Findings

01. Heuristics improve planning efficiency.

02. Belief space heuristic accounts for information value.

03. Domain-independent heuristics reduce reliance on domain-specific tuning.

Abstract

Partially observable Markov decision processes (POMDPs) are a useful model for decision-making under partial observability and stochastic actions. Partially Observable Monte-Carlo Planning (POMCP) is an online algorithm for deciding on the next action to perform, using a Monte-Carlo tree search approach based on the UCT (UCB applied to trees) algorithm for fully observable Markov decision processes. POMCP develops an action-observation tree and, at the leaves, uses a rollout policy to provide a value estimate for the leaf. As such, POMCP depends heavily on the rollout policy to compute good estimates and hence to identify good actions. Thus, many practitioners who use POMCP are required to create strong, domain-specific heuristics. In this paper, we model POMDPs as stochastic contingent planning problems. This allows us to leverage domain-independent heuristics that were developed in the…
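To make the role of the rollout policy concrete, here is a minimal Python sketch of a leaf-value estimate under two rollout policies. It is an illustration under stated assumptions, not the paper's method: the 1-D corridor domain, the `heuristic` function (a simple distance-to-goal stand-in for a domain-independent planning heuristic), and every name and parameter below are hypothetical. The sketch also rolls out from a single sampled state and ignores observations.

```python
import random

GOAL = 5            # hypothetical goal cell in a 1-D corridor (assumption)
ACTIONS = (-1, +1)  # move left / move right


def step(state, action, rng, success_prob=0.8):
    """Stochastic transition: the move succeeds with success_prob, else the agent stays put."""
    if rng.random() < success_prob:
        state = max(0, min(GOAL, state + action))
    return state, -1.0  # unit cost per step


def heuristic(state):
    """Stand-in heuristic (assumption): estimated number of steps to the goal."""
    return abs(GOAL - state)


def random_policy(state, rng):
    """Uniformly random rollout policy."""
    return rng.choice(ACTIONS)


def heuristic_policy(state, rng):
    """Greedy rollout policy: pick the action whose intended successor has the lowest heuristic."""
    return min(ACTIONS, key=lambda a: heuristic(max(0, min(GOAL, state + a))))


def rollout(state, policy, rng, max_depth=50, gamma=1.0, success_prob=0.8):
    """Simulate one rollout from a sampled state and return its discounted return."""
    total, discount = 0.0, 1.0
    for _ in range(max_depth):
        if state == GOAL:
            break
        action = policy(state, rng)
        state, reward = step(state, action, rng, success_prob)
        total += discount * reward
        discount *= gamma
    return total


def estimate_leaf_value(state, policy, n_rollouts=200, seed=0, **kwargs):
    """Average the returns of several rollouts to estimate the value of a leaf."""
    rng = random.Random(seed)
    returns = [rollout(state, policy, rng, **kwargs) for _ in range(n_rollouts)]
    return sum(returns) / n_rollouts


if __name__ == "__main__":
    print("random rollouts:   ", estimate_leaf_value(0, random_policy))
    print("heuristic rollouts:", estimate_leaf_value(0, heuristic_policy))
```

In this toy setting the heuristic-guided rollouts head straight for the goal, so their estimate is closer to the optimal value than the estimate from uniformly random rollouts. That gap is the kind of rollout-quality difference the abstract describes as central to POMCP.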

Taxonomy

Methods: Monte-Carlo Tree Search