A Mean Field Approach for Optimization in Particles Systems and Applications

Nicolas Gast (INRIA Rhône-Alpes / LIG, Laboratoire d'Informatique de Grenoble); Bruno Gaujal (INRIA Rhône-Alpes / LIG, Laboratoire d'Informatique de Grenoble)

arXiv:0903.2352·math.PR·June 10, 2009

Open Access

TL;DR

This paper develops a mean field approach to analyze the asymptotic behavior of large particle-based Markov decision processes, demonstrating convergence of optimal costs and policies, and applying the results to grid computing optimization.

Contribution

It introduces a mean field framework for MDPs with many particles, proves convergence of optimal costs and policies, and explicitly computes the optimal policy of the limit deterministic system for a brokering problem in grid computing.

Findings

01

Optimal costs converge almost surely to a deterministic limit as the number of particles grows large

02

Central limit theorems quantify the speed of convergence, with explicit formulas for the variance of the limit Gaussian laws

03

The mean field optimal policy outperforms classical policies in large-scale simulations

Abstract

This paper investigates the limit behavior of Markov Decision Processes (MDPs) made of independent particles evolving in a common environment, when the number of particles goes to infinity. In the finite horizon case, or with a discounted cost and an infinite horizon, we show that when the number of particles becomes large, the optimal cost of the system converges almost surely to the optimal cost of a discrete deterministic system (the "optimal mean field"). Convergence also holds for optimal policies. We further provide insights on the speed of convergence by proving several central limit theorems for the cost and the state of the Markov decision process, with explicit formulas for the variance of the limit Gaussian laws. Then, our framework is applied to a brokering problem in grid computing. The optimal policy for the limit deterministic system is computed explicitly. Several…
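
The convergence described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes a hypothetical two-state particle with an illustrative transition matrix `P`, no control actions, and no common environment. It only shows the underlying law-of-large-numbers effect, namely that the empirical occupancy of many independent particles approaches a deterministic recursion (the mean field) as the number of particles grows.

```python
import random

# Hypothetical two-state particle. These transition probabilities are
# illustrative assumptions, not values from the paper.
P = [[0.9, 0.1],
     [0.4, 0.6]]


def mean_field_step(m):
    """One step of the deterministic limit: push the occupancy vector through P."""
    return [sum(m[i] * P[i][j] for i in range(2)) for j in range(2)]


def mean_field(horizon):
    """Occupancy of the deterministic system after `horizon` steps, starting in state 0."""
    m = [1.0, 0.0]
    for _ in range(horizon):
        m = mean_field_step(m)
    return m


def simulate(n_particles, horizon, rng):
    """Empirical occupancy of n independent particles after `horizon` steps."""
    states = [0] * n_particles  # every particle starts in state 0
    for _ in range(horizon):
        # Each particle moves independently according to its current row of P.
        states = [0 if rng.random() < P[s][0] else 1 for s in states]
    frac_one = sum(states) / n_particles
    return [1.0 - frac_one, frac_one]


def gap(n_particles, horizon=10, seed=0):
    """Absolute difference between empirical and mean field occupancy of state 1."""
    rng = random.Random(seed)
    empirical = simulate(n_particles, horizon, rng)
    limit = mean_field(horizon)
    return abs(empirical[1] - limit[1])
```

Calling `gap(n)` for increasing `n` (for example 100, 10_000, 100_000) typically shows the gap shrinking at roughly the rate `1 / sqrt(n)`, which is the scale at which central limit theorems of the kind mentioned in the abstract operate.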

Taxonomy

Topics: Advanced Bandit Algorithms Research · Markov Chains and Monte Carlo Methods · Optimization and Search Problems