Solving infinite-horizon Dec-POMDPs using Finite State Controllers within JESP

Yang You; Vincent Thomas; Francis Colas; Olivier Buffet

arXiv:2109.08755·cs.AI·September 21, 2021

TL;DR

This paper extends the JESP algorithm to infinite-horizon Dec-POMDPs by utilizing finite state controllers, enabling the computation of Nash equilibria in more complex collaborative planning scenarios.

Contribution

It introduces infJESP, a novel variant of JESP for infinite-horizon Dec-POMDPs using FSCs, along with heuristic initializations and experimental validation.
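The alternating best-response idea behind JESP can be illustrated on a much simpler stand-in than a Dec-POMDP: a two-player matrix game in which both players receive the same payoff. The sketch below is not infJESP. The function name `alternating_best_response` and the payoff table are hypothetical and chosen only for illustration. Each player in turn switches to a best response against the other's fixed choice, and the loop stops when neither can improve, which is a Nash equilibrium of the toy game.

```python
def alternating_best_response(payoff, row=0, col=0):
    """Alternate best responses in a common-payoff matrix game.

    payoff[r][c] is the shared reward when player 1 picks r and player 2
    picks c. Returns a pair (row, col) from which no single player can
    improve the shared payoff by deviating alone.
    """
    while True:
        improved = False
        # Player 1 best-responds to the fixed column.
        best_row = max(range(len(payoff)), key=lambda r: payoff[r][col])
        if payoff[best_row][col] > payoff[row][col]:
            row, improved = best_row, True
        # Player 2 best-responds to the (possibly updated) row.
        best_col = max(range(len(payoff[0])), key=lambda c: payoff[row][c])
        if payoff[row][best_col] > payoff[row][col]:
            col, improved = best_col, True
        if not improved:
            return row, col


# Made-up payoff table with three equilibria of different quality.
payoff = [[1, 0, 0],
          [0, 3, 0],
          [0, 0, 5]]

from_default = alternating_best_response(payoff)        # starts at (0, 0)
from_corner = alternating_best_response(payoff, 2, 2)   # starts at (2, 2)
```

In this toy game the search started at (0, 0) stays there with payoff 1, while the search started at (2, 2) keeps the payoff of 5. Both end points are equilibria, which suggests why the choice of initialization can matter for an equilibrium-based search.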

Findings

01. Effective in benchmark problems

02. Outperforms finite-horizon approaches

03. Provides scalable solutions

Abstract

This paper looks at solving collaborative planning problems formalized as Decentralized POMDPs (Dec-POMDPs) by searching for Nash equilibria, i.e., situations where each agent's policy is a best response to the other agents' (fixed) policies. While the Joint Equilibrium-based Search for Policies (JESP) algorithm does this in the finite-horizon setting by relying on policy trees, we propose here to adapt it to infinite-horizon Dec-POMDPs by using finite state controller (FSC) policy representations. In this article, we (1) explain how to turn a Dec-POMDP with $N - 1$ fixed FSCs into an infinite-horizon POMDP whose solution is a best response for the $N^{\text{th}}$ agent; (2) propose a JESP variant, called infJESP, that uses this construction to solve infinite-horizon Dec-POMDPs; (3) introduce heuristic initializations for JESP that aim to lead to good solutions; and (4) conduct experiments on state-of-the-art…
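Step (1) of the abstract can be made concrete with a minimal sketch for two agents. It assumes that agent 2 follows a fixed deterministic FSC, and it folds that FSC into the dynamics seen by agent 1 by using pairs (state, FSC node of agent 2) as extended states. All names (`best_response_model`, `T`, `O`, `R`, `act2`, `eta2`) and the randomly generated toy model are hypothetical, and the construction in the paper may differ, for instance in what the extended state contains.

```python
import itertools
import random


def random_dist(n, rng):
    """Return a random probability vector of length n."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def best_response_model(T, O, R, act2, eta2):
    """Fold agent 2's fixed FSC into the dynamics seen by agent 1.

    T[s][a1][a2][s2]      : state transition probabilities
    O[a1][a2][s2][o1][o2] : joint observation probabilities
    R[s][a1][a2]          : shared reward
    act2[n2]              : action chosen by agent 2 in FSC node n2
    eta2[n2][o2]          : next FSC node of agent 2 after observing o2

    Returns (P, Rext) where P[(s, n2)][a1] maps (s2, n2_next, o1) to its
    probability and Rext[(s, n2)][a1] is the reward in the extended state.
    """
    n_states, n_a1 = len(T), len(T[0])
    n_o1, n_o2 = len(O[0][0][0]), len(O[0][0][0][0])
    P, Rext = {}, {}
    for s, n2 in itertools.product(range(n_states), range(len(act2))):
        a2 = act2[n2]  # agent 2's action is determined by its FSC node
        P[(s, n2)], Rext[(s, n2)] = [], []
        for a1 in range(n_a1):
            out = {}
            for s2, o1, o2 in itertools.product(
                range(n_states), range(n_o1), range(n_o2)
            ):
                p = T[s][a1][a2][s2] * O[a1][a2][s2][o1][o2]
                key = (s2, eta2[n2][o2], o1)  # o2 is marginalized out
                out[key] = out.get(key, 0.0) + p
            P[(s, n2)].append(out)
            Rext[(s, n2)].append(R[s][a1][a2])
    return P, Rext


# Randomly generated toy model: 2 states, 2 actions and 2 observations per
# agent, and a 3-node FSC for agent 2. All numbers are made up.
rng = random.Random(0)
nS, nA, nO, nN2 = 2, 2, 2, 3
T = [[[random_dist(nS, rng) for _ in range(nA)] for _ in range(nA)]
     for _ in range(nS)]


def random_joint_obs():
    """Return a random joint distribution over (o1, o2) as a nested list."""
    flat = random_dist(nO * nO, rng)
    return [flat[i * nO:(i + 1) * nO] for i in range(nO)]


O = [[[random_joint_obs() for _ in range(nS)] for _ in range(nA)]
     for _ in range(nA)]
R = [[[rng.uniform(-1, 1) for _ in range(nA)] for _ in range(nA)]
     for _ in range(nS)]
act2 = [rng.randrange(nA) for _ in range(nN2)]
eta2 = [[rng.randrange(nN2) for _ in range(nO)] for _ in range(nN2)]

P, Rext = best_response_model(T, O, R, act2, eta2)
```

The returned `P` is a joint model over the next extended state and agent 1's observation, so correlated observations between the two agents are preserved. Any infinite-horizon POMDP solver that accepts such a model could then, in principle, be used to compute agent 1's best response.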
