On Dynamic Programming Theory for Leader-Follower Stochastic Games

Jilles Steeve Dibangoye; Thibaut Le Marre; Ocan Sankur; François Schwarzentruber

arXiv:2512.05667·cs.GT·December 8, 2025

Open Access

TL;DR

This paper develops a dynamic programming framework for leader-follower stochastic games, enabling computation of strong Stackelberg equilibria with scalable algorithms and empirical validation on various benchmarks.

Contribution

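It introduces a DP approach over credible sets, proves that leader-follower stochastic games reduce losslessly to MDPs over credible sets, shows that synthesising an optimal memoryless deterministic leader policy is NP-hard, and provides ε-optimal algorithms for leader strategies.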

Findings

01

Empirical gains in leader value over existing methods.

02

Scalable algorithms demonstrated on security, resource allocation, and adversarial planning.

03

Theoretical foundations for DP over credible sets and NP-hardness of optimal leader policy synthesis.

Abstract

Leader-follower general-sum stochastic games (LF-GSSGs) model sequential decision-making under asymmetric commitment, where a leader commits to a policy and a follower best responds, yielding a strong Stackelberg equilibrium (SSE) with leader-favourable tie-breaking. This paper introduces a dynamic programming (DP) framework that applies Bellman recursion over credible sets (state abstractions formally representing all rational follower best responses under partial leader commitments) to compute SSEs. We first prove that any LF-GSSG admits a lossless reduction to a Markov decision process (MDP) over credible sets. We further establish that synthesising an optimal memoryless deterministic leader policy is NP-hard, motivating the development of ε-optimal DP algorithms with provable guarantees on leader exploitability. Experiments on standard mixed-motive benchmarks, including…
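
To make the SSE notion in the abstract concrete, the following is a minimal sketch for a one-shot two-player matrix game in which the leader commits to a single pure action. It is not the paper's credible-set DP algorithm; the function name `pure_strong_stackelberg` and the payoff matrices are hypothetical and chosen only for illustration. The follower best responds to the committed action, and ties among follower best responses are broken in the leader's favour.

```python
from typing import List, Optional, Tuple


def pure_strong_stackelberg(
    leader_payoff: List[List[int]],
    follower_payoff: List[List[int]],
) -> Tuple[int, int, int]:
    """Return (leader action, follower action, leader value).

    Illustrative assumption: a one-shot game where the leader commits to a
    pure action (row) and the follower then picks a column.
    """
    best: Optional[Tuple[int, int, int]] = None
    for a, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # Follower best responses to the committed leader action a.
        top = max(f_row)
        best_responses = [b for b, v in enumerate(f_row) if v == top]
        # Leader-favourable tie-breaking among follower best responses.
        b = max(best_responses, key=lambda j: l_row[j])
        # Keep the commitment with the highest resulting leader value.
        if best is None or l_row[b] > best[2]:
            best = (a, b, l_row[b])
    assert best is not None, "payoff matrices must be non-empty"
    return best


# Hypothetical payoffs: rows are leader actions, columns are follower actions.
leader = [[2, 4], [1, 3]]
follower = [[1, 1], [0, 1]]
print(pure_strong_stackelberg(leader, follower))
```

In this toy instance the follower is indifferent between both columns when the leader commits to row 0, so the tie is resolved towards the column that gives the leader 4. The sequential, stochastic setting treated in the paper additionally requires reasoning over states and partial commitments, which this sketch does not attempt.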

Taxonomy

Topics: Infrastructure Resilience and Vulnerability Analysis · Game Theory and Applications · Reinforcement Learning in Robotics