Reinforcement Learning in Deep Structured Teams: Initial Results with Finite and Infinite Valued Features

Jalal Arabneydi; Masoud Roudneshin; Amir G. Aghdam

arXiv:2010.02868·cs.MA·February 9, 2021

TL;DR

This paper explores reinforcement learning algorithms for deep structured teams with Markov chain and linear quadratic models, addressing both known and unknown models, and demonstrates their application to smart grid management.

Contribution

It introduces reinforcement learning methods for deep structured teams with finite and infinite valued features, including convergence proofs and practical application to smart grids.

Findings

01. The proposed RL algorithms converge for both Markov chain and linear quadratic models.

02. The RL algorithms cover the case where the model is not completely known.

03. An application to smart grid management demonstrates practical utility.

Abstract

In this paper, we consider Markov chain and linear quadratic models for deep structured teams with discounted and time-average cost functions under two non-classical information structures, namely, deep state sharing and no sharing. In deep structured teams, agents are coupled in dynamics and cost functions through deep state, where deep state refers to a set of orthogonal linear regressions of the states. In this article, we consider a homogeneous linear regression for Markov chain models (i.e., empirical distribution of states) and a few orthonormal linear regressions for linear quadratic models (i.e., weighted average of states). Some planning algorithms are developed for the case when the model is known, and some reinforcement learning algorithms are proposed for the case when the model is not known completely. The convergence of two model-free (reinforcement learning) algorithms,…
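
The abstract describes the deep state as the empirical distribution of states in the Markov chain case and a weighted average of states in the linear quadratic case. The following is a minimal sketch of those two quantities, not the authors' implementation. The function names, the scalar states, and the 1/n normalization of the weighted average are assumptions made for illustration.

```python
from collections import Counter


def empirical_distribution(states, state_space):
    """Fraction of agents in each state of a finite state space.

    Hypothetical helper: illustrates the deep state for the Markov chain model.
    """
    n = len(states)
    counts = Counter(states)
    return [counts[s] / n for s in state_space]


def weighted_average(states, weights):
    """Weighted average (1/n) * sum_i w_i * x_i of scalar agent states.

    Hypothetical helper: illustrates the deep state for the linear quadratic
    model. The 1/n normalization is an assumption.
    """
    n = len(states)
    return sum(w * x for w, x in zip(weights, states)) / n


if __name__ == "__main__":
    # Four agents, each in state 0 or 1.
    print(empirical_distribution([0, 1, 1, 0], [0, 1]))
    # Two agents with real-valued states and unit weights.
    print(weighted_average([1.0, 3.0], [1.0, 1.0]))
```

In both cases the deep state has a fixed size that does not grow with the number of agents: the length of the state space in the first case, and a single number per weight vector in the second.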
