Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent

Gangshan Jing; He Bai; Jemin George; Aranya Chakrabortty; Piyush K. Sharma

arXiv:2107.12416 · eess.SY · May 6, 2024

Open Access

TL;DR

This paper introduces an asynchronous distributed zeroth-order optimization algorithm tailored for reinforcement learning in large-scale networks, reducing variance and eliminating the need for global consensus, with applications to distributed LQR control.

Contribution

It presents a novel distributed zeroth-order algorithm leveraging network structure for local gradient estimation without consensus, suitable for non-convex stochastic optimization and RL.

Findings

01. The algorithm achieves lower gradient-estimation variance than centralized zeroth-order methods.

02. It converges effectively on distributed LQR control problems.

03. It operates asynchronously, without any global consensus protocol.
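For readers unfamiliar with the LQR setting named in the findings, the following is a minimal sketch of the kind of cost a zeroth-order method would query: the quadratic cost of a static state-feedback gain, measured by simulating the closed loop. The system matrices `A`, `B`, the weights `Q`, `R`, the horizon, and the deterministic rollout are all assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical discretized double integrator (not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)  # state weight (assumed)
R = np.eye(1)  # input weight (assumed)


def lqr_cost(K, x0, horizon=50):
    """Finite-horizon quadratic cost of the feedback law u = -K x."""
    x = np.asarray(x0, dtype=float)
    total = 0.0
    for _ in range(horizon):
        u = -K @ x
        total += float(x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u  # closed-loop state update
    return total


if __name__ == "__main__":
    x0 = [1.0, 1.0]
    print(lqr_cost(np.zeros((1, 2)), x0))        # open loop, no control
    print(lqr_cost(np.array([[1.0, 1.5]]), x0))  # a stabilizing gain
```

A zeroth-order method treats `lqr_cost` as a black box: it only compares cost values at perturbed gains and never differentiates through the dynamics.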

Abstract

Recently introduced distributed zeroth-order optimization (ZOO) algorithms have shown their utility in distributed reinforcement learning (RL). Unfortunately, in the gradient estimation process, almost all of them require random samples with the same dimension as the global variable and/or require evaluation of the global cost function, which may induce high estimation variance for large-scale networks. In this paper, we propose a novel distributed zeroth-order algorithm by leveraging the network structure inherent in the optimization objective, which allows each agent to estimate its local gradient by local cost evaluation independently, without use of any consensus protocol. The proposed algorithm exhibits an asynchronous update scheme, and is designed for stochastic non-convex optimization with a possibly non-convex feasible domain based on the block coordinate descent method. The…
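The core idea in the abstract, that each agent estimates the gradient for its own block from local cost evaluations only, can be illustrated with a generic two-point zeroth-order block estimator. The sketch below is not the paper's algorithm: the ring-coupled quadratic cost, the names `local_cost`, `block_gradient_estimate`, and `run`, and all step sizes are assumptions made for illustration. Each agent perturbs only its own block, so the random sample has the block's dimension rather than the global one, and only the cost terms that involve that block need to be evaluated.

```python
import numpy as np

N_AGENTS, BLOCK_DIM = 4, 2  # hypothetical problem size


def local_cost(x, i):
    """Terms of the global cost that involve agent i's block (ring coupling)."""
    left, right = x[(i - 1) % N_AGENTS], x[(i + 1) % N_AGENTS]
    return float(np.sum(x[i] ** 2)
                 + 0.5 * np.sum((x[i] - right) ** 2)
                 + 0.5 * np.sum((left - x[i]) ** 2))


def global_cost(x):
    """Full cost; used here only to monitor progress, never by the agents."""
    return float(np.sum(x ** 2)
                 + 0.5 * np.sum((x - np.roll(x, -1, axis=0)) ** 2))


def block_gradient_estimate(x, i, radius, rng):
    """Two-point zeroth-order estimate of the gradient w.r.t. block i."""
    u = rng.standard_normal(BLOCK_DIM)
    u /= np.linalg.norm(u)  # uniform direction on the unit sphere
    x_plus, x_minus = x.copy(), x.copy()
    x_plus[i] += radius * u
    x_minus[i] -= radius * u
    diff = local_cost(x_plus, i) - local_cost(x_minus, i)
    return BLOCK_DIM / (2.0 * radius) * diff * u


def run(steps=2000, step_size=0.05, radius=0.01, seed=0):
    rng = np.random.default_rng(seed)
    x = np.ones((N_AGENTS, BLOCK_DIM))
    initial = global_cost(x)
    for _ in range(steps):
        i = rng.integers(N_AGENTS)  # one agent wakes up at a time
        x[i] -= step_size * block_gradient_estimate(x, i, radius, rng)
    return initial, global_cost(x)


if __name__ == "__main__":
    print(run())
```

Because cost terms that do not involve block `i` cancel in the two-point difference, evaluating `local_cost` gives the same estimate as evaluating the full cost would, which is why no global evaluation or consensus step appears in this toy example.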

Taxonomy

Topics: Distributed Control Multi-Agent Systems · Adaptive Dynamic Programming Control · Neural Networks Stability and Synchronization