Private and Common Information States in Decentralized Parallel Dynamic Programming for Delayed Sharing Patterns
Charalambos D. Charalambous, Umarbek Guvercin, Seddik Djouadi

TL;DR
This paper introduces a dynamic programming framework for decentralized stochastic control with delayed sharing information patterns, carrying over the properties of classical DP for centralized POMDPs to this decentralized setting.
Contribution
It develops a DP approach for delayed sharing patterns, establishing properties of value functions and information states in decentralized control.
Findings
Derived necessary and sufficient conditions for Person-by-Person optimality.
Formulated simplified DP equations using private and common information states.
Settled a long-standing open problem on delayed sharing patterns in DP.
Abstract
This paper develops a dynamic programming (DP) approach for decentralized stochastic optimal control problems with delayed sharing information patterns, which exhibits the fundamental properties of classical DP for centralized partially observable Markov decision problems (POMDPs): the value functions and information states depend on the actions of the minimizing controls and not on their strategies. This is achieved by invoking the concept of Person-by-Person (PbP) optimality, in which each control strategy is associated with a value function conditioned on its assigned delayed sharing information pattern, when all other strategies are fixed to their optimal responses. The value functions satisfy generalized and simplified DP equations. These are used to derive necessary and sufficient conditions for PbP optimality. The simplified DP equations are obtained by invoking the structural…
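As a rough illustration of the Person-by-Person (PbP) optimality concept named in the abstract, the sketch below checks PbP optimality in a toy static team problem. Everything here is an assumption made for illustration: the cost table `COST`, the two-agent finite-action setup, and the helper names are hypothetical and are not taken from the paper. The toy has no dynamics, no observations, and no delayed sharing, so it shows only the fixed-point idea that each agent's choice is optimal when the other agent's choice is held fixed.

```python
from itertools import product

# Hypothetical common cost for two agents: COST[u1][u2].
# Both agents minimize the same cost (a team problem).
COST = [
    [1.0, 4.0, 5.0],
    [4.0, 0.0, 6.0],
    [5.0, 6.0, 3.0],
]


def is_pbp_optimal(cost, u1, u2):
    """True if neither agent can lower the common cost by deviating alone."""
    # Best cost agent 1 can reach when agent 2 is fixed at u2.
    best_for_1 = min(cost[a][u2] for a in range(len(cost)))
    # Best cost agent 2 can reach when agent 1 is fixed at u1.
    best_for_2 = min(cost[u1][b] for b in range(len(cost[0])))
    return cost[u1][u2] == best_for_1 and cost[u1][u2] == best_for_2


def pbp_optimal_pairs(cost):
    """All action pairs that are PbP optimal."""
    pairs = product(range(len(cost)), range(len(cost[0])))
    return [(u1, u2) for u1, u2 in pairs if is_pbp_optimal(cost, u1, u2)]


def team_optimal_pair(cost):
    """The action pair that minimizes the common cost jointly."""
    pairs = product(range(len(cost)), range(len(cost[0])))
    return min(pairs, key=lambda p: cost[p[0]][p[1]])


if __name__ == "__main__":
    print("PbP optimal pairs:", pbp_optimal_pairs(COST))
    print("Team optimal pair:", team_optimal_pair(COST))
```

In this toy table several pairs are PbP optimal but only one of them minimizes the cost jointly, which is the usual reason PbP optimality is treated as a condition to be characterized in its own right rather than as a synonym for joint optimality.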
