Anytime Planning for Decentralized POMDPs using Expectation Maximization

Akshat Kumar; Shlomo Zilberstein

arXiv:1203.3490 · cs.AI · March 19, 2012 · 32 cites

Open Access

TL;DR

This paper addresses infinite-horizon decentralized POMDPs by framing policy optimization as inference in a mixture of dynamic Bayesian networks (DBNs) and solving it with Expectation Maximization (EM). The formulation supports richer representations, such as factored or continuous states and actions, and compares favorably with state-of-the-art solvers on benchmark domains.

Contribution

It presents a new class of algorithms that recast infinite-horizon DEC-POMDP optimization as inference in a mixture of DBNs, derives an EM algorithm to optimize the joint policy, and reports benchmark results that compare favorably with the state of the art.

Findings

01. EM compares favorably with state-of-the-art solvers on benchmark domains

02. Supports richer representations such as factored or continuous states and actions

03. Allows existing DBN inference techniques to be applied to DEC-POMDPs

Abstract

Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. While finite-horizon DEC-POMDPs have enjoyed significant success, progress remains slow for the infinite-horizon case mainly due to the inherent complexity of optimizing stochastic controllers representing agent policies. We present a promising new class of algorithms for the infinite-horizon case, which recasts the optimization problem as inference in a mixture of DBNs. An attractive feature of this approach is the straightforward adoption of existing inference techniques in DBNs for solving DEC-POMDPs and supporting richer representations such as factored or continuous states and actions. We also derive the Expectation Maximization (EM) algorithm to optimize the joint policy represented as DBNs. Experiments on benchmark domains show that EM compares favorably against the state-of-the-art…
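
The "inference in a mixture of DBNs" idea in the abstract can be illustrated in its simplest form. The sketch below is not the paper's algorithm. It assumes the standard planning-as-inference construction: rewards are rescaled to [0, 1] and read as the probability of a binary "reward event", and a prior P(T) = (1 - γ)γ^T is placed over the length T of each finite-time model in the mixture. Under that assumption, the likelihood of the reward event is an affine function of the discounted value, so maximizing one maximizes the other. The two-state chain, its rewards, and all function names are hypothetical and stand in for the process induced by some fixed joint controller.

```python
# Hypothetical 2-state Markov chain, standing in for the process induced by
# a fixed joint policy. All numbers are made up for illustration.
P = [[0.9, 0.1],
     [0.3, 0.7]]      # P[s][s2] = probability of moving from s to s2
R = [-1.0, 4.0]       # reward received in each state
B0 = [1.0, 0.0]       # initial state distribution
GAMMA = 0.9           # discount factor
HORIZON = 500         # truncation point; GAMMA**500 is negligible


def step(dist):
    """Propagate a state distribution one step through the chain."""
    n = len(dist)
    return [sum(dist[s] * P[s][s2] for s in range(n)) for s2 in range(n)]


def expected_rewards(reward):
    """Expected value of `reward` at each time step t = 0..HORIZON-1."""
    dist, out = B0[:], []
    for _ in range(HORIZON):
        out.append(sum(d * r for d, r in zip(dist, reward)))
        dist = step(dist)
    return out


def discounted_value():
    """Usual discounted return: sum_t GAMMA^t * E[r_t]."""
    return sum(GAMMA ** t * er for t, er in enumerate(expected_rewards(R)))


def mixture_likelihood():
    """Likelihood of the reward event under the mixture over lengths T."""
    r_min, r_max = min(R), max(R)
    # Rescaled reward in [0, 1], read as P(reward event | state).
    r_hat = [(r - r_min) / (r_max - r_min) for r in R]
    return sum((1 - GAMMA) * GAMMA ** T * er
               for T, er in enumerate(expected_rewards(r_hat)))


def value_from_likelihood(lik):
    """Invert the affine map between likelihood and discounted value."""
    r_min, r_max = min(R), max(R)
    return ((r_max - r_min) * lik + r_min) / (1 - GAMMA)


if __name__ == "__main__":
    lik = mixture_likelihood()
    print("discounted value:     ", discounted_value())
    print("value from likelihood:", value_from_likelihood(lik))
```

The two printed numbers agree up to truncation and floating-point error. In an EM-style method, the parameters being adjusted would be those of the agents' controllers rather than a fixed chain, which this toy example deliberately leaves out.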

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Reinforcement Learning in Robotics · Advanced Control Systems Optimization