Actor critic learning algorithms for mean-field control with moment neural networks

Huyên Pham, Xavier Warin

arXiv:2309.04317·stat.ML·September 11, 2023

TL;DR

This paper introduces an actor-critic reinforcement learning algorithm for mean-field control problems. It uses moment neural networks on the Wasserstein space to sample trajectories of distributions directly and to handle an operator specific to the mean-field setting.

Contribution

It presents a new policy gradient method with moment neural networks for continuous-time mean-field control, enabling direct sampling of distribution trajectories.

Findings

01 Effective in multi-dimensional settings

02 Handles nonlinear quadratic mean-field problems

03 Demonstrates convergence and robustness

Abstract

We develop a new policy gradient and actor-critic algorithm for solving mean-field control problems within a continuous-time reinforcement learning setting. Our approach leverages a gradient-based representation of the value function, employing parametrized randomized policies. The learning for both the actor (policy) and critic (value function) is facilitated by a class of moment neural network functions on the Wasserstein space of probability measures, and the key feature is to sample trajectories of distributions directly. A central challenge addressed in this study pertains to the computational treatment of an operator specific to the mean-field framework. To illustrate the effectiveness of our methods, we provide a comprehensive set of numerical results. These encompass diverse examples, including multi-dimensional settings and nonlinear quadratic mean-field control problems with…
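
To make the idea of a function on the space of probability measures concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional distribution represented by a particle sample, and it assumes that a "moment network" can be illustrated as a small feed-forward network whose inputs are the first few empirical moments of that sample. The names `empirical_moments` and `MomentNetwork`, the layer sizes, and the random untrained weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

def empirical_moments(particles, max_order=3):
    """Return [E[X], E[X^2], ..., E[X^max_order]] estimated from a 1-D sample."""
    particles = np.asarray(particles, dtype=float)
    return np.array([np.mean(particles ** k) for k in range(1, max_order + 1)])

class MomentNetwork:
    """Hypothetical one-hidden-layer network acting on empirical moments.

    Illustrative assumption: the distribution enters only through its
    moments, so the output does not depend on the ordering of the particles.
    Weights are random and untrained.
    """

    def __init__(self, max_order=3, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.max_order = max_order
        self.W1 = rng.normal(scale=0.5, size=(hidden, max_order))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(scale=0.5, size=hidden)
        self.b2 = 0.0

    def __call__(self, particles):
        m = empirical_moments(particles, self.max_order)  # distribution features
        h = np.tanh(self.W1 @ m + self.b1)                # hidden layer
        return float(self.w2 @ h + self.b2)               # scalar output

# Evaluate a critic-like function on a sampled distribution.
rng = np.random.default_rng(1)
sample = rng.normal(loc=0.5, scale=1.0, size=1000)
critic = MomentNetwork()
value = critic(sample)
value_permuted = critic(rng.permutation(sample))
```

Because the network sees the sample only through its moments, `value` and `value_permuted` agree up to floating-point rounding. That permutation invariance is what lets such a function be read as depending on the distribution rather than on individual particles.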

Taxonomy

Topics: Model Reduction and Neural Networks · Fluid Dynamics and Turbulent Flows · Adaptive Dynamic Programming Control