Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization

Ke Sun; Yafei Wang; Yi Liu; Yingnan Zhao; Bo Pan; Shangling Jui; Bei Jiang; Linglong Kong

arXiv:2110.08896·cs.LG·October 22, 2021·1 cites

TL;DR

This paper provides a rigorous analysis of Anderson mixing in deep reinforcement learning, demonstrating its benefits for acceleration, convergence, and stability through theoretical insights and extensive experiments.

Contribution

It establishes a theoretical connection between Anderson mixing and quasi-Newton methods, and introduces stabilization strategies for improved deep RL performance.

Findings

01. Increases convergence radius of policy iteration schemes.

02. Enhances stability and performance of RL algorithms.

03. Provides a theoretical foundation for Anderson mixing in RL.

Abstract

Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms for accelerating convergence and improving the sampling efficiency of deep RL. Despite its heuristic improvement of convergence, a rigorous mathematical justification for the benefits of Anderson mixing in RL has not yet been put forward. In this paper, we provide deeper insights into a class of acceleration schemes built on Anderson mixing that improve the convergence of deep RL algorithms. Our main results establish a connection between Anderson mixing and quasi-Newton methods and prove that Anderson mixing increases the convergence radius of policy iteration schemes by an extra contraction factor. The analysis is rooted in the fixed-point iteration nature of RL. We further propose a stabilization strategy by introducing a stable regularization term in Anderson mixing and a…
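To make the fixed-point view concrete, here is a minimal sketch of textbook damped Anderson mixing applied to policy evaluation on a small random MDP, where the Bellman operator v -> r + gamma * P v is a contraction. It is not the paper's algorithm: the function name `damped_anderson`, the parameters `m`, `beta`, and `reg`, the random MDP, and the plain Tikhonov term used here as a stand-in for the paper's regularization are all assumptions made for illustration.

```python
import numpy as np


def damped_anderson(g, x0, m=5, beta=0.8, reg=1e-10, tol=1e-10, max_iter=500):
    """Damped Anderson mixing for the fixed-point problem x = g(x).

    m    : number of past iterates kept in addition to the current one
    beta : damping (mixing) parameter; beta = 1 gives undamped mixing
    reg  : relative Tikhonov weight (illustrative, not the paper's term)
    Returns the final iterate and the number of mixing steps taken.
    """
    x = np.asarray(x0, dtype=float)
    xs, gs = [], []  # sliding windows of iterates and their images under g
    for k in range(max_iter):
        gx = g(x)
        if np.linalg.norm(gx - x) < tol:
            return x, k
        xs = (xs + [x])[-(m + 1):]
        gs = (gs + [gx])[-(m + 1):]
        X = np.stack(xs, axis=1)   # iterates as columns
        G = np.stack(gs, axis=1)   # images as columns
        R = G - X                  # residuals g(x_i) - x_i
        # Minimize ||R a||^2 + lam * ||a||^2 subject to sum(a) = 1.
        # Closed form: solve (R^T R + lam I) z = 1, then normalize z.
        M = R.T @ R
        M = M + reg * np.trace(M) * np.eye(M.shape[0])
        z = np.linalg.solve(M, np.ones(M.shape[0]))
        alpha = z / z.sum()
        # Damped update: mixed iterate plus beta times the mixed residual.
        x = X @ alpha + beta * (R @ alpha)
    return x, max_iter


# Hypothetical test problem: evaluate a fixed policy on a random 6-state MDP.
rng = np.random.default_rng(0)
n, gamma = 6, 0.9
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
r = rng.random(n)                  # per-state rewards


def bellman(v):
    # Bellman evaluation operator, a gamma-contraction in the sup norm
    return r + gamma * P @ v


v_exact = np.linalg.solve(np.eye(n) - gamma * P, r)
v_am, steps_am = damped_anderson(bellman, np.zeros(n))
print(steps_am, np.max(np.abs(v_am - v_exact)))
```

The weights `alpha` are chosen so that the combined residual over the window is as small as possible, which is the step that links Anderson mixing to secant-type (quasi-Newton) updates. Setting `m=0` reduces the loop to a damped plain fixed-point iteration, which gives a simple baseline for comparing step counts.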

Videos

Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization · SlidesLive

Taxonomy

Topics: Model Reduction and Neural Networks