Partially Observable Monte-Carlo Graph Search

Yang You; Vincent Thomas; Alex Schutz; Robert Skilton; Nick Hawes; Olivier Buffet

arXiv:2507.20951·cs.AI·July 29, 2025

TL;DR

This paper introduces POMCGS, a novel offline sampling-based algorithm that constructs policy graphs for large POMDPs, enabling scalable, analyzable solutions that outperform previous offline methods and compete with online algorithms.

Contribution

POMCGS is the first offline algorithm to efficiently generate policies for large POMDPs by folding search trees into policy graphs, incorporating action widening and observation clustering.
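For readers unfamiliar with action progressive widening, the following is a minimal sketch of the widening rule as it is commonly stated in the Monte-Carlo tree search literature: a node may sample a new action only while its number of children is at most k · N^alpha, where N is its visit count. The constants `k` and `alpha`, the function names, and the uniform action sampler are all illustrative assumptions; this summary does not specify how POMCGS parameterizes or applies widening.

```python
import random


def may_add_action(num_actions, num_visits, k=2.0, alpha=0.5):
    """Common progressive-widening test: allow a new action while
    the number of tried actions is at most k * visits**alpha.
    (k and alpha are hypothetical values, not taken from the paper.)"""
    return num_actions <= k * num_visits ** alpha


def simulate_widening(total_visits, sample_action, k=2.0, alpha=0.5):
    """Visit a single node repeatedly and record which actions get added."""
    actions = []
    for visits in range(1, total_visits + 1):
        if may_add_action(len(actions), visits, k, alpha):
            # Draw a fresh action from the (possibly continuous) action space.
            actions.append(sample_action())
    return actions


rng = random.Random(0)
# Hypothetical one-dimensional continuous action space [-1, 1].
actions = simulate_widening(100, lambda: rng.uniform(-1.0, 1.0))
print(len(actions), "actions kept after 100 visits")
```

The number of actions grows sublinearly with the visit count, which is what keeps the branching factor manageable when the action space is continuous.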

Findings

01. POMCGS can solve large, challenging POMDPs previously unsolvable offline.

02. Generated policies are competitive with state-of-the-art online algorithms.

03. The method effectively handles continuous POMDPs with proposed techniques.

Abstract

Currently, large partially observable Markov decision processes (POMDPs) are often solved by sampling-based online methods, which interleave planning and execution phases. A pre-computed offline policy is more desirable in POMDP applications with time or energy constraints, yet previous offline algorithms are not able to scale up to large POMDPs. In this article, we propose a new sampling-based algorithm, the partially observable Monte-Carlo graph search (POMCGS), to solve large POMDPs offline. Unlike many online POMDP methods, which progressively develop a tree while performing (Monte-Carlo) simulations, POMCGS folds this search tree on the fly to construct a policy graph, so that computations can be drastically reduced and users can analyze and validate the policy prior to embedding and executing it. Moreover, POMCGS, together with action progressive widening and…
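To make the idea of folding a search tree into a policy graph more concrete, here is a small sketch under stated assumptions. It is not the authors' algorithm: the merge test (an L1 distance between discrete beliefs with a hypothetical `merge_threshold`), the class and method names, and the toy two-state problem are all invented for illustration. It only shows the general pattern of reusing an existing node when a newly reached belief is close enough, and then executing the resulting graph as a finite-state controller without further planning.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GraphNode:
    belief: dict                                 # state -> probability (discrete, assumed)
    action: Optional[str] = None                 # action prescribed at this node
    edges: dict = field(default_factory=dict)    # observation -> index of next node


def l1_distance(b1, b2):
    """L1 distance between two discrete beliefs given as dicts."""
    states = set(b1) | set(b2)
    return sum(abs(b1.get(s, 0.0) - b2.get(s, 0.0)) for s in states)


class PolicyGraph:
    def __init__(self, merge_threshold=0.1):
        self.nodes = []
        self.merge_threshold = merge_threshold   # hypothetical merge criterion

    def get_or_add(self, belief):
        """Fold: reuse a node with a similar belief, else create a new one."""
        for idx, node in enumerate(self.nodes):
            if l1_distance(node.belief, belief) <= self.merge_threshold:
                return idx
        self.nodes.append(GraphNode(belief=dict(belief)))
        return len(self.nodes) - 1

    def run(self, start, observations):
        """Execute the graph as a controller; return the actions taken."""
        idx, taken = start, []
        for obs in observations:
            taken.append(self.nodes[idx].action)
            idx = self.nodes[idx].edges[obs]
        return taken


# Toy two-state example (hypothetical states, actions, and observations).
graph = PolicyGraph(merge_threshold=0.1)
root = graph.get_or_add({"left": 0.5, "right": 0.5})
heard_left = graph.get_or_add({"left": 0.85, "right": 0.15})
folded = graph.get_or_add({"left": 0.52, "right": 0.48})  # close to root, so reused

graph.nodes[root].action = "listen"
graph.nodes[root].edges = {"hear_left": heard_left, "hear_right": root}
graph.nodes[heard_left].action = "open_right"
graph.nodes[heard_left].edges = {"hear_left": root, "hear_right": root}

print(graph.run(root, ["hear_left", "hear_left"]))
```

Because similar beliefs share a node, the structure stays finite and small enough to inspect by hand, which is the property the abstract highlights when it mentions analyzing and validating the policy before deployment.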
