Improved POMDP Tree Search Planning with Prioritized Action Branching

John Mern; Anil Yildiz; Larry Bush; Tapan Mukerji; Mykel J. Kochenderfer

arXiv:2010.03599 · cs.LG · November 4, 2021

TL;DR

This paper introduces PA-POMCPOW, an online POMDP solver for problems with large action spaces. It scores each action by a linear combination of expected reward and expected information gain and adds only the highest-scoring actions to the search tree.

Contribution

The paper presents PA-POMCPOW, an action sampling method for POMDP tree search that selects a subset of a large action space offering varying mixtures of exploitation and exploration.

Findings

01 Outperforms existing state-of-the-art solvers on problems with large discrete action spaces

02 Balances exploration and exploitation during search

03 Demonstrates scalability and improved decision quality

Abstract

Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. This paper proposes a method called PA-POMCPOW to sample a subset of the action space that provides varying mixtures of exploitation and exploration for inclusion in a search tree. The proposed method first evaluates the action space according to a score function that is a linear combination of expected reward and expected information gain. The actions with the highest score are then added to the search tree during tree expansion. Experiments show that PA-POMCPOW is able to outperform existing state-of-the-art solvers on problems with large discrete action spaces.
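The scoring step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the weighted-particle belief, the use of expected entropy reduction as the information-gain measure, the deterministic observation model, and the weight `lam` are assumptions made only for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def expected_reward(belief, action, reward_fn):
    """Belief-weighted immediate reward; belief is a list of (state, weight)."""
    return sum(w * reward_fn(s, action) for s, w in belief)

def expected_info_gain(belief, action, obs_fn):
    """Expected drop in belief entropy after observing obs_fn(state, action).

    Assumption: observations are a deterministic function of state and action,
    so the posterior is the prior restricted to states sharing an observation.
    """
    prior_h = entropy([w for _, w in belief])
    groups = {}
    for s, w in belief:
        groups.setdefault(obs_fn(s, action), []).append(w)
    post_h = 0.0
    for ws in groups.values():
        p_o = sum(ws)  # probability of this observation
        if p_o > 0:
            post_h += p_o * entropy([w / p_o for w in ws])
    return prior_h - post_h

def prioritized_actions(belief, actions, reward_fn, obs_fn, lam=1.0, k=3):
    """Return the k actions with the highest score.

    score = expected reward + lam * expected information gain
    (a linear combination, with lam as a hypothetical trade-off weight).
    """
    scored = [
        (expected_reward(belief, a, reward_fn)
         + lam * expected_info_gain(belief, a, obs_fn), a)
        for a in actions
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in scored[:k]]

# Toy problem (hypothetical): four equally likely hidden states.
belief = [(0, 0.25), (1, 0.25), (2, 0.25), (3, 0.25)]
actions = ["grab", "look_half", "look_all", "noop"]

def reward_fn(state, action):
    # "grab" pays off immediately; sensing actions carry a small cost.
    return {"grab": 1.0, "look_half": -0.2, "look_all": -0.2, "noop": 0.0}[action]

def obs_fn(state, action):
    if action == "look_all":
        return state        # reveals the state exactly
    if action == "look_half":
        return state < 2    # reveals which half the state is in
    return None             # uninformative observation

greedy = prioritized_actions(belief, actions, reward_fn, obs_fn, lam=0.0, k=1)
mixed = prioritized_actions(belief, actions, reward_fn, obs_fn, lam=1.0, k=2)
```

In this toy setup, `lam=0.0` ranks actions by reward alone, while a larger `lam` promotes the fully informative sensing action. Only the returned subset would be offered to the tree search as candidate branches at expansion time.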

Code & Models

Repositories

sisl/PA-POMCPOW.jl

Videos

Improved POMDP Tree Search Planning with Prioritized Action Branching