Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach

Yingying Li; Yujie Tang; Runyu Zhang; Na Li

arXiv:1912.09135·eess.SY·October 26, 2020

TL;DR

This paper introduces ZODPO, a distributed reinforcement learning algorithm for decentralized linear quadratic control. It learns stabilizing local controllers with limited communication and storage, which makes it suitable for large-scale systems.

Contribution

The paper proposes a zero-order distributed policy optimization method for decentralized control that combines policy gradient, zero-order optimization, and consensus algorithms.

Findings

1. The sample complexity of ZODPO for approaching a stationary point is polynomial in the inverse of the error tolerance and in the problem dimensions.

2. The learned controllers are stabilizing with high probability.

3. The algorithm performs well in simulations of a multi-zone HVAC system.

Abstract

This paper considers a distributed reinforcement learning problem for decentralized linear quadratic control with partial state observations and local costs. We propose a Zero-Order Distributed Policy Optimization algorithm (ZODPO) that learns linear local controllers in a distributed fashion, leveraging the ideas of policy gradient, zero-order optimization, and consensus algorithms. In ZODPO, each agent estimates the global cost by consensus and then performs local policy gradient updates in parallel, based on zero-order gradient estimation. ZODPO requires only limited communication and storage, even in large-scale systems. Further, we investigate the nonasymptotic performance of ZODPO and show that the sample complexity to approach a stationary point is polynomial in the inverse of the error tolerance and in the problem dimensions, demonstrating the scalability of ZODPO. We also show that the…
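The two building blocks named in the abstract, consensus-based cost estimation and zero-order gradient estimation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the mixing matrix `W`, the one-point sphere-sampling estimator, and the centralized normalization of the perturbation are all assumptions chosen for brevity, and `local_costs` is a hypothetical callback standing in for a rollout of the perturbed controllers.

```python
import numpy as np

def consensus_average(values, W, rounds):
    """Mix local values with a doubly stochastic matrix W for a fixed number of rounds.

    Each entry of the result is one agent's running estimate of the network average.
    """
    x = np.asarray(values, dtype=float)
    for _ in range(rounds):
        x = W @ x
    return x

def zero_order_gradient(cost_fn, K, radius, rng):
    """One-point zero-order gradient estimate of cost_fn at K (assumed estimator).

    Samples a direction U uniformly on the unit sphere and returns
    (d / radius) * cost_fn(K + radius * U) * U, where d = K.size.
    """
    U = rng.standard_normal(K.shape)
    U /= np.linalg.norm(U)
    return (K.size / radius) * cost_fn(K + radius * U) * U

def distributed_step(K_blocks, local_costs, W, radius, step, rounds, rng):
    """One illustrative update: joint perturbation, consensus on cost, local descent.

    K_blocks    : list of per-agent gain matrices
    local_costs : hypothetical callback mapping perturbed gains to one cost per agent
    """
    U = [rng.standard_normal(K.shape) for K in K_blocks]
    # Assumption: the joint perturbation is normalized centrally here for brevity.
    scale = np.sqrt(sum(np.sum(u * u) for u in U))
    U = [u / scale for u in U]
    perturbed = [K + radius * u for K, u in zip(K_blocks, U)]
    costs = local_costs(perturbed)              # one observed local cost per agent
    est = consensus_average(costs, W, rounds)   # each agent's estimate of the average cost
    d = sum(K.size for K in K_blocks)
    # Each agent updates only its own block, using only its own perturbation.
    return [K - step * (d / radius) * est[i] * U[i] for i, K in enumerate(K_blocks)]
```

In this sketch each agent needs only its own perturbation and a scalar cost estimate obtained from neighbors, which is one way to read the abstract's claim of limited communication and storage. The one-point estimator has high variance, so a small step size or averaging over several samples would typically be needed in practice.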

Code & Models

Repositories

DianYu420376/multi-agent-RL-numerical-simulation
