Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
Chenyu Zhang, Xu Chen, Xuan Di

TL;DR
This paper introduces a novel stochastic gradient descent method, SemiSGD, for learning mean field games by treating policy and population as a unified parameter, enabling simultaneous updates and improved stability.
Contribution
It proposes the first population-aware linear function approximation for MFGs and provides finite-time convergence analysis for the method.
Findings
SemiSGD converges to MFG equilibrium under certain conditions.
PA-LFA effectively handles continuous state-action spaces.
Experimental results validate theoretical convergence and approximation properties.
Abstract
Mean field games (MFGs) model interactions in large-population multi-agent systems through population distributions. Traditional learning methods for MFGs are based on fixed-point iteration (FPI), where policy updates and induced population distributions are computed separately and sequentially. However, FPI-type methods may suffer from inefficiency and instability due to potential oscillations caused by this forward-backward procedure. In this work, we propose a novel perspective that treats the policy and population as a unified parameter controlling the game dynamics. By applying stochastic parameter approximation to this unified parameter, we develop SemiSGD, a simple stochastic gradient descent (SGD)-type method, where an agent updates its policy and population estimates simultaneously and fully asynchronously. Building on this perspective, we further apply linear function…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Gaussian Processes and Bayesian Inference
