Operator Learning for Families of Finite-State Mean-Field Games
William Hofgard, Asaf Cohen, Mathieu Laurière

TL;DR
This paper introduces an operator learning framework for efficiently solving parametric families of finite-state mean-field games, providing theoretical guarantees and demonstrating high accuracy on cybersecurity and high-dimensional benchmark models.
Contribution
The paper presents a novel operator learning approach that generalizes across different initial conditions and costs in finite-state MFGs, with theoretical error bounds and empirical validation.
Findings
Achieves accurate approximation of MFG solutions in a cybersecurity example
Effective in high-dimensional quadratic MFG models
Provides theoretical guarantees on approximation error and generalization
Abstract
Finite-state mean-field games (MFGs) arise as limits of large interacting particle systems and are governed by an MFG system, a coupled forward-backward differential equation consisting of a forward Kolmogorov-Fokker-Planck (KFP) equation describing the population distribution and a backward Hamilton-Jacobi-Bellman (HJB) equation defining the value function. Solving MFG systems efficiently is challenging, with the structure of each system depending on an initial distribution of players and the terminal cost of the game. We propose an operator learning framework that solves parametric families of MFGs, enabling generalization without retraining for new initial distributions and terminal costs. We provide theoretical guarantees on the approximation error, parametric complexity, and generalization performance of our method, based on a novel regularity result for an appropriately defined…
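To make the forward-backward structure described in the abstract concrete, here is a minimal sketch of a classical single-instance solver: a damped Picard (fixed-point) iteration on an explicit Euler time grid. It is not the paper's operator learning method. The model is a toy chosen for illustration, and everything in it is an assumption: the quadratic transition cost (so the optimal jump rate from state x to state y is the positive part of u(x) - u(y)), the congestion-type mean-field cost F(x, mu) = mu(x), a terminal cost that does not depend on the population, and the hypothetical name `solve_mfg`.

```python
import numpy as np

def solve_mfg(eta, g, T=1.0, n_steps=200, n_iter=200, damping=0.5, tol=1e-8):
    """Damped Picard iteration for a toy finite-state MFG (illustrative only).

    eta : initial distribution over d states (nonnegative, sums to 1)
    g   : terminal cost per state (assumed independent of the population)
    """
    eta = np.asarray(eta, dtype=float)
    g = np.asarray(g, dtype=float)
    d = len(eta)
    dt = T / n_steps
    mu = np.tile(eta, (n_steps + 1, 1))  # initial guess: population stays at eta
    u = np.zeros((n_steps + 1, d))
    gap = np.inf
    for _ in range(n_iter):
        # Backward HJB step: -du/dt = H(x, u) + F(x, mu), with u(T) = g.
        u[-1] = g
        for n in range(n_steps, 0, -1):
            diff = u[n][:, None] - u[n][None, :]   # diff[x, y] = u(x) - u(y)
            rates = np.maximum(diff, 0.0)          # optimal rate from x to y
            H = -0.5 * np.sum(rates**2, axis=1)    # Hamiltonian for quadratic cost
            u[n - 1] = u[n] + dt * (H + mu[n])     # assumed F(x, mu) = mu(x)
        # Forward KFP step: dmu/dt = Q(u)^T mu, with mu(0) = eta.
        new_mu = np.empty_like(mu)
        new_mu[0] = eta
        for n in range(n_steps):
            diff = u[n][:, None] - u[n][None, :]
            rates = np.maximum(diff, 0.0)
            Q = rates - np.diag(rates.sum(axis=1))  # generator, rows sum to zero
            new_mu[n + 1] = new_mu[n] + dt * (Q.T @ new_mu[n])
        # Damped update of the population flow; stop when it no longer moves.
        gap = np.max(np.abs(new_mu - mu))
        mu = damping * mu + (1.0 - damping) * new_mu
        if gap < tol:
            break
    return u, mu, gap

eta = np.array([0.2, 0.3, 0.5])
g = np.array([0.0, 0.5, 1.0])
u, mu, gap = solve_mfg(eta, g)
```

A solver of this kind has to be rerun from scratch for every new pair (`eta`, `g`). That per-instance cost is what the operator learning framework in the abstract aims to avoid by generalizing across initial distributions and terminal costs without retraining.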
Peer Reviews
Decision: Submitted to ICLR 2026
- This work is well-motivated, as learning MFGs is in general challenging and computation-heavy.
- The paper is well-written and easy to follow.
- Theoretical guarantees are discussed for the supervised learning method.
- Various numerical experiments are conducted, with their settings and results clearly presented.
- Any finite state space can be embedded into a continuous state space. It's thus not clear to me why addressing only finite state spaces is a contribution of this work over those addressing continuous state spaces.
- The contribution of the work over Cohen et al. (2024) is the ability to handle different terminal cost functions, which seems marginal to me.
- Separable running cost function (action and population are decoupled) is considered, which is a huge simplification that needs further c
- The primary strength of this work is its framing of the problem. Moving from solving single MFG instances to learning an operator for entire parameterized families is a significant and practical step forward. This approach enables rapid, retrain-free generalization to new initial conditions and cost functions, which is highly valuable for applicability of MFGs.
- The paper provides rigorous theoretical guarantees for the approximation error and generalization error. The results require common
- The paper honestly notes that learning becomes "increasingly unstable" for dimensions beyond d=10. The success in d=20 is shown for a simplified setting with a fixed time discretization. This suggests that the primary method has practical scalability limits, since only 10 to 20 states could be too few in practice.
- The theoretical guarantees, while valuable, also exhibit a "curse of dimensionality" in their dependence on the state. These bounds suggest that the methodology in general doe
* The studied problem is well-motivated, and the relationship between the current work and related works is clearly stated.
* The experimental settings are described in detail.
* The authors have provided both theoretical analysis and experimental results on their proposed methods.
* The proposed method has the potential to be used to solve MFGs efficiently, because it does not need to be re-trained to adapt to a specific choice of the MFG.
**Theoretical side**: I think the theoretical bounds provided in this paper are loose, and I doubt whether it is possible to use them to gain practically relevant insights. In particular, my concerns rest mainly on the following points:
* In both Corollary 4.3 and 4.5, the dependency on the parameter $\mathcal{K}$ is omitted in the big-O notation. Since the dimension of the parameterized space of terminal cost functions is an important block of the proposed methods, and its dimension can be large,
Taxonomy
Topics: Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference · Simulation Techniques and Applications
