Optimal and Approximate Q-value Functions for Decentralized POMDPs

Frans A. Oliehoek; Matthijs T. J. Spaan; Nikos Vlassis

arXiv:1111.0062·cs.AI·November 2, 2011

TL;DR

This paper explores defining and approximating Q-value functions for decentralized POMDPs, enabling more efficient policy computation and providing bounds on optimal solutions, with practical algorithms and experimental validation.

Contribution

It introduces two forms of optimal Q-value functions for Dec-POMDPs and analyzes approximate versions, unifying previous methods and offering new algorithms for policy extraction.

Findings

1. All of the approximate Q-value functions analyzed provide an upper bound on the optimal Q*.

2. The proposed algorithms can extract policies from approximate Q-values.

3. Experimental results validate the effectiveness of the approaches on test problems.

Abstract

Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the…
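The single-agent recipe that the abstract takes as its starting point (compute Q* recursively by dynamic programming, then extract a policy from it) can be illustrated on a toy finite-horizon MDP. The sketch below is only an illustration of that background idea, not the paper's Dec-POMDP algorithm: the two-state, two-action model, the names `finite_horizon_q` and `greedy_policy`, and all numbers are hypothetical and chosen for readability.

```python
def finite_horizon_q(T, R, horizon):
    """Backward dynamic programming for a finite-horizon MDP (illustrative).

    T[s][a][s2] is the probability of moving from s to s2 under action a.
    R[s][a] is the immediate reward for taking action a in state s.
    Returns Q with Q[t][s][a] = optimal expected return from stage t onward.
    """
    n_states = len(T)
    n_actions = len(T[0])
    v_next = [0.0] * n_states  # value after the last stage is zero
    Q = [None] * horizon
    for t in reversed(range(horizon)):
        Q[t] = [
            [
                R[s][a] + sum(T[s][a][s2] * v_next[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        # The optimal value of a state is the best Q-value over actions.
        v_next = [max(Q[t][s]) for s in range(n_states)]
    return Q


def greedy_policy(Q):
    """Extract a policy by picking, per stage and state, an action maximizing Q."""
    return [
        [max(range(len(q_s)), key=q_s.__getitem__) for q_s in Q_t]
        for Q_t in Q
    ]


# Hypothetical toy model: action 0 stays in place, action 1 switches state.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1
]
# Reward 1 only for staying in state 1.
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]

Q = finite_horizon_q(T, R, horizon=2)
policy = greedy_policy(Q)
print(Q[0])    # stage-0 Q-values per state
print(policy)  # chosen action index per stage and state
```

In this toy model the extracted policy switches out of state 0 at the first stage in order to collect the reward in state 1 afterwards. In the decentralized setting each agent only sees its own observations, which is why the paper asks whether an analogous Q-value function can be defined at all.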
