Unified continuous-time q-learning for mean-field game and mean-field control problems
Xiaoli Wei, Xiang Yu, Fengyi Yuan

TL;DR
This paper introduces a unified continuous-time q-learning framework for mean-field game (MFG) and mean-field control (MFC) problems in jump-diffusion models, enabling policy evaluation and learning when the population distribution is not directly observable.
Contribution
The paper proposes the integrated q-function in decoupled form (decoupled Iq-function) and establishes its martingale characterization, which unifies policy evaluation for MFG and MFC problems. It also develops a q-learning algorithm applicable to financial jump-diffusion models.
Findings
Successful application to financial jump-diffusion models
Exact parameterization of Iq-functions and value functions
Satisfactory performance of the proposed q-learning algorithm
Abstract
This paper studies continuous-time q-learning in mean-field jump-diffusion models when the population distribution is not directly observable. We propose the integrated q-function in decoupled form (decoupled Iq-function) from the representative agent's perspective and establish its martingale characterization, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, we consider a learning procedure in which the representative agent updates the population distribution based on his own state values. Depending on whether the task is to solve the MFG or the MFC problem, we can employ the decoupled Iq-function differently to characterize the mean-field equilibrium policy or the mean-field optimal policy, respectively. Based on these theoretical findings, we devise a unified q-learning algorithm for both MFG and MFC problems by…
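To make the setting concrete, the following is a minimal sketch, not the paper's algorithm, of a single representative agent whose state follows a discretized jump-diffusion and who estimates an unobserved population mean from his own state values alone. The mean-reverting dynamics, the running-average update rule, and every name (`simulate_and_estimate`, `kappa`, `jump_rate`, `jump_scale`, and so on) are assumptions chosen for illustration; the abstract does not specify them.

```python
import math
import random


def poisson_sample(rng, mean):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_and_estimate(x0, mu_hat0, kappa, sigma, jump_rate, jump_scale,
                          dt, n_steps, seed=0):
    """Euler scheme for an assumed mean-reverting jump-diffusion.

    The agent never observes the population distribution. He keeps a
    running-average estimate `mu_hat` of the population mean, built only
    from his own states (an illustrative update rule, assumed here).
    """
    rng = random.Random(seed)
    x, mu_hat = x0, mu_hat0
    path, estimates = [x], [mu_hat]
    for k in range(1, n_steps + 1):
        # Drift pulls the state toward the current population-mean estimate.
        drift = kappa * (mu_hat - x) * dt
        # Brownian increment over one step of length dt.
        diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Compound Poisson jumps with Gaussian sizes over the same step.
        n_jumps = poisson_sample(rng, jump_rate * dt)
        jumps = sum(rng.gauss(0.0, jump_scale) for _ in range(n_jumps))
        x = x + drift + diffusion + jumps
        # Running average; the initial guess mu_hat0 counts as one sample.
        mu_hat = mu_hat + (x - mu_hat) / (k + 1)
        path.append(x)
        estimates.append(mu_hat)
    return path, estimates


if __name__ == "__main__":
    path, estimates = simulate_and_estimate(
        x0=1.0, mu_hat0=0.0, kappa=0.5, sigma=0.2,
        jump_rate=1.0, jump_scale=0.1, dt=0.01, n_steps=1000,
    )
    print(f"final state: {path[-1]:.4f}, final mean estimate: {estimates[-1]:.4f}")
```

In a q-learning loop of the kind the abstract describes, an estimate of this sort would stand in for the unobserved population distribution when evaluating a policy. How the paper's update actually enters the decoupled Iq-function is not recoverable from the truncated abstract.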
Taxonomy
Topics: Guidance and Control Systems
Methods: Q-Learning
