Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic

Yufeng Zhang; Siyu Chen; Zhuoran Yang; Michael I. Jordan; Zhaoran Wang

arXiv:2112.13530 · cs.LG · April 2, 2024

TL;DR

This paper gives a mean-field analysis of neural actor-critic with overparameterized two-layer networks, showing convergence to the globally optimal policy at a sublinear rate and characterizing how the feature representation evolves during training.

Contribution

The paper introduces a mean-field framework for neural actor-critic algorithms and uses it to establish convergence and describe feature evolution in the infinite-width, continuous-time limit.

Findings

01 Neural AC converges to the globally optimal policy at a sublinear rate.

02 Feature representations evolve within a neighborhood of the initial features.

03 The analysis applies to overparameterized two-layer neural networks with two-timescale updates.
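
As a rough illustration of the overparameterized two-layer setting named in the findings, the sketch below evaluates a width-M network with the 1/M averaging commonly used in mean-field analyses. The tanh activation, the packing of each neuron's parameters into one row, and the names `init_particles` and `mean_field_net` are assumptions made for illustration, not the paper's construction.

```python
import numpy as np


def init_particles(num_neurons, input_dim, seed=0):
    """Draw one parameter row per neuron (hypothetical layout).

    Column 0 holds the output weight b_i; the remaining columns hold
    the input weights w_i. Gaussian initialization is an assumption.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((num_neurons, input_dim + 1))


def mean_field_net(theta, x):
    """Evaluate f(x) = (1/M) * sum_i b_i * tanh(w_i . x).

    theta: array of shape (M, d + 1), one row per neuron.
    x:     array of shape (n, d), one row per input.
    Returns an array of shape (n,).
    """
    b = theta[:, 0]
    w = theta[:, 1:]
    hidden = np.tanh(x @ w.T)            # shape (n, M)
    return hidden @ b / theta.shape[0]   # average over neurons, not a sum
```

Because the output is an average over neurons, duplicating every row of `theta` leaves the prediction unchanged. The network therefore depends only on the empirical distribution of neuron parameters, which is the object a mean-field analysis tracks as the width grows.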

Abstract

Actor-critic (AC) algorithms, empowered by neural networks, have had significant empirical success in recent years. However, most of the existing theoretical support for AC algorithms focuses on the case of linear function approximations, or linearized neural networks, where the feature representation is fixed throughout training. Such a limitation fails to capture the key aspect of representation learning in neural AC, which is pivotal in practical problems. In this work, we take a mean-field perspective on the evolution and convergence of feature-based neural AC. Specifically, we consider a version of AC where the actor and critic are represented by overparameterized two-layer neural networks and are updated with two-timescale learning rates. The critic is updated by temporal-difference (TD) learning with a larger stepsize while the actor is updated via proximal policy optimization…
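
The link between proximal policy updates and replicator dynamics suggested by the title can be illustrated on a finite action set. This is a sketch under stated assumptions: it uses the standard KL-proximal (exponentiated) update, pi_new ∝ pi * exp(eta * Q), with a fixed Q table rather than the paper's neural parameterization, and the function names are hypothetical. As the stepsize `eta` shrinks, the finite difference of this update approaches the replicator velocity pi * (Q - E_pi[Q]).

```python
import numpy as np


def proximal_policy_step(pi, q, eta):
    """One KL-proximal improvement step per state: pi_new ∝ pi * exp(eta * q).

    pi:  array (num_states, num_actions), rows strictly positive, summing to 1.
    q:   array (num_states, num_actions) of action values (assumed given).
    eta: positive stepsize.
    """
    logits = np.log(pi) + eta * q
    logits -= logits.max(axis=-1, keepdims=True)   # numerical stability
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum(axis=-1, keepdims=True)


def replicator_velocity(pi, q):
    """Replicator dynamics: d pi / dt = pi * (q - E_pi[q]) for each state."""
    baseline = (pi * q).sum(axis=-1, keepdims=True)
    return pi * (q - baseline)
```

Actions whose value exceeds the policy's current average gain probability mass, and the rest lose it, while each row stays a valid probability distribution.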

Videos

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning