Reinforcement Learning for Discrete-time LQG Mean Field Social Control Problems with Unknown Dynamics

Hanfang Zhang; Bing-Chang Wang; Shuo Chen

arXiv:2507.01420·math.OC·December 5, 2025

TL;DR

This paper develops reinforcement learning algorithms to solve discrete-time LQG mean field social control problems with unknown dynamics, addressing the challenges posed by agent interactions and system coupling.

Contribution

It introduces both model-based and model-free RL algorithms for unknown dynamics in mean field control, with proven convergence and practical data-driven updates.

Findings

1. Model-based policy iteration converges under stabilizability and detectability.

2. The model-free RL algorithm effectively approximates optimal control using data from agents.

3. Numerical results verify the algorithms' effectiveness in complex scenarios.

Abstract

This paper studies the discrete-time linear-quadratic-Gaussian mean field (MF) social control problem over an infinite horizon, where the dynamics of all agents are unknown. The objective is to design a reinforcement learning (RL) algorithm to approximate the decentralized asymptotically optimal social control in terms of two algebraic Riccati equations (AREs). In this problem, a coupling term is introduced into the system dynamics to capture the interactions among agents. This invalidates the equivalence between model-based and model-free methods, which makes it difficult to apply traditional model-free algorithms directly. First, under the assumptions of system stabilizability and detectability, a model-based policy iteration algorithm is proposed to approximate the stabilizing solution of the AREs. The algorithm is proven to be convergent in both cases of semi-positive definite…
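To make the idea of model-based policy iteration for an ARE concrete, here is a minimal sketch of the classical Hewer-style iteration for a single standard discrete-time ARE. It is not the paper's algorithm: the paper works with two coupled AREs arising from the mean field term, whereas this sketch assumes an ordinary LQR problem with known matrices. The matrices A, B, Q, R, the initial gain K0, and all function names are illustrative assumptions.

```python
import numpy as np


def solve_discrete_lyapunov(a_cl, q_cl):
    """Solve P = a_cl^T P a_cl + q_cl by vectorization (small systems only)."""
    n = a_cl.shape[0]
    lhs = np.eye(n * n) - np.kron(a_cl.T, a_cl.T)
    p = np.linalg.solve(lhs, q_cl.reshape(-1)).reshape(n, n)
    return 0.5 * (p + p.T)  # symmetrize against round-off


def policy_iteration(a, b, q, r, k0, iters=50, tol=1e-10):
    """Hewer-style iteration for the control law u = -K x.

    k0 is assumed to be stabilizing (spectral radius of a - b @ k0 below 1).
    """
    k = k0
    p_prev = None
    for _ in range(iters):
        a_cl = a - b @ k
        # Policy evaluation: cost matrix of the current gain.
        p = solve_discrete_lyapunov(a_cl, q + k.T @ r @ k)
        # Policy improvement: greedy gain with respect to p.
        k = np.linalg.solve(r + b.T @ p @ b, b.T @ p @ a)
        if p_prev is not None and np.linalg.norm(p - p_prev) < tol:
            break
        p_prev = p
    return p, k


def dare_residual(a, b, q, r, p):
    """Residual of P = A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A + Q."""
    gain = np.linalg.solve(r + b.T @ p @ b, b.T @ p @ a)
    return a.T @ p @ a - a.T @ p @ b @ gain + q - p


# Illustrative data (assumed, not taken from the paper). A is already
# stable, so the zero gain is a valid stabilizing starting point.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))

P, K = policy_iteration(A, B, Q, R, K0)
print("ARE residual norm:", np.linalg.norm(dare_residual(A, B, Q, R, P)))
```

Each pass evaluates the current gain through a Lyapunov equation and then improves it, which is the evaluate-improve structure that policy iteration methods for AREs share. A model-free variant would replace the Lyapunov solve, which needs A and B, with an estimate built from measured state and input data.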

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Distributed Control Multi-Agent Systems