Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs
Sven Seuken, Shlomo Zilberstein

TL;DR
This paper improves the Memory-Bounded Dynamic Programming (MBDP) algorithm for decentralized POMDPs by reducing its complexity in the number of observations from exponential to polynomial, deriving error bounds on solution quality, and demonstrating scalability on a new, larger benchmark problem.
Contribution
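The paper generalizes MBDP and reduces its complexity with respect to the number of observations from exponential to polynomial. It derives error bounds on solution quality for the new approximation, analyzes the convergence behavior, and evaluates the improvements on a new, larger benchmark problem.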
Findings
Complexity in the number of observations reduced from exponential to polynomial
Error bounds derived on solution quality under the new approximation
Strong experimental performance on a new, larger benchmark problem
Abstract
Memory-Bounded Dynamic Programming (MBDP) has proved extremely effective in solving decentralized POMDPs with large horizons. We generalize the algorithm and improve its scalability by reducing the complexity with respect to the number of observations from exponential to polynomial. We derive error bounds on solution quality with respect to this new approximation and analyze the convergence behavior. To evaluate the effectiveness of the improvements, we introduce a new, larger benchmark problem. Experimental results show that despite the high complexity of decentralized POMDPs, scalable solution techniques such as MBDP perform surprisingly well.
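To make the exponential dependence on observations concrete, here is a minimal sketch that is not taken from the paper. It assumes the standard count for an exhaustive dynamic-programming backup (one root action plus one retained subtree per observation) and, as a further assumption about how the improvement works, that the bounded variant branches on at most `max_obs` observations. The function names and the parameters `max_trees` and `max_obs` are illustrative.

```python
def exhaustive_backup_size(num_actions: int, max_trees: int, num_obs: int) -> int:
    """Policy trees one agent gets from a full backup (assumed standard count):
    choose a root action, then one of max_trees subtrees for every observation."""
    return num_actions * max_trees ** num_obs


def bounded_backup_size(num_actions: int, max_trees: int,
                        num_obs: int, max_obs: int) -> int:
    """Same count when branching is limited to at most max_obs observations
    (hypothetical parameter; an assumption about the improved algorithm)."""
    branching = min(num_obs, max_obs)
    return num_actions * max_trees ** branching


if __name__ == "__main__":
    # Compare growth as the observation set gets larger.
    for num_obs in (2, 5, 10):
        full = exhaustive_backup_size(3, 3, num_obs)
        bounded = bounded_backup_size(3, 3, num_obs, max_obs=2)
        print(f"|O|={num_obs:2d}  full={full:7d}  bounded={bounded}")
```

Under these assumptions the full count grows exponentially with the number of observations, while the bounded count stops growing once the number of observations exceeds `max_obs`.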
Taxonomy
Topics: Optimization and Search Problems · Auction Theory and Applications · Reinforcement Learning in Robotics
