Reinforcement Learning in Deep Structured Teams: Initial Results with Finite and Infinite Valued Features
Jalal Arabneydi, Masoud Roudneshin, Amir G. Aghdam

TL;DR
This paper explores reinforcement learning algorithms for deep structured teams with Markov chain and linear quadratic models, addressing both known and unknown models, and demonstrates their application to smart grid management.
Contribution
It introduces reinforcement learning methods for deep structured teams with finite and infinite valued features, including convergence proofs and practical application to smart grids.
Findings
Proposed RL algorithms converge for both Markov chain and linear quadratic models.
The reinforcement learning algorithms cover the case where the model is not completely known.
An application to a smart grid illustrates their practical utility.
Abstract
In this paper, we consider Markov chain and linear quadratic models for deep structured teams with discounted and time-average cost functions under two non-classical information structures, namely, deep state sharing and no sharing. In deep structured teams, agents are coupled in dynamics and cost functions through deep state, where deep state refers to a set of orthogonal linear regressions of the states. In this article, we consider a homogeneous linear regression for Markov chain models (i.e., empirical distribution of states) and a few orthonormal linear regressions for linear quadratic models (i.e., weighted average of states). Some planning algorithms are developed for the case when the model is known, and some reinforcement learning algorithms are proposed for the case when the model is not known completely. The convergence of two model-free (reinforcement learning) algorithms,…
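The two kinds of deep state named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the integer encoding of local states, and the 1/n normalization of the weighted average are assumptions made here for illustration only.

```python
import numpy as np

def empirical_distribution(states, num_states):
    """Fraction of agents in each local state (finite-valued feature).

    Assumes local states are encoded as integers 0..num_states-1.
    """
    counts = np.bincount(np.asarray(states), minlength=num_states)
    return counts / len(states)

def weighted_average(states, weights):
    """Weighted average of real-valued states (infinite-valued feature).

    Assumes states has shape (n, d), weights has shape (n,), and the
    sum is normalized by the number of agents n.
    """
    states = np.asarray(states, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return (weights[:, None] * states).sum(axis=0) / len(states)

# Five agents with local states in {0, 1, 2}.
agent_states = [0, 2, 2, 1, 2]
print(empirical_distribution(agent_states, num_states=3))  # [0.2 0.2 0.6]

# Two agents with 2-dimensional states and unit weights.
print(weighted_average([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))  # [2. 3.]
```

In both cases the feature is a linear function of the agents' states, so it has a fixed dimension that does not grow with the number of agents.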
