Compositional Reinforcement Learning for Discrete-Time Stochastic Control Systems

Abolfazl Lavaei; Mateo Perez; Milad Kazemi; Fabio Somenzi; Sadegh Soudjani; Ashutosh Trivedi; Majid Zamani

arXiv:2208.03485 · eess.SY · August 9, 2022

Open Access

TL;DR

This paper introduces a compositional, model-free reinforcement learning method for synthesizing policies in networks of stochastic control systems with unknown dynamics, and provides lower bounds on the probability that the overall network satisfies its specification.

Contribution

The paper develops a compositional approach that combines assume-guarantee reasoning with adversarial RL (minimax Q-learning) to handle unknown dynamics in continuous-space systems with formal temporal logic specifications.

Findings

01. Effective policy synthesis demonstrated on temperature regulation network

02. Successful control of traffic network with probabilistic guarantees

03. Accelerated learning via potential-based reward shaping
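
The third finding refers to potential-based reward shaping. This summary does not give the potential function used in the paper, so the following is only a minimal sketch of the standard form, in which the shaping term gamma * Phi(s') - Phi(s) is added to each reward. The chain of states, the goal at state 4, the `potential` function, and the helper names are all hypothetical choices made for illustration.

```python
def shaped_reward(reward, s, s_next, potential, gamma):
    """Add the potential-based shaping term gamma * Phi(s') - Phi(s)."""
    return reward + gamma * potential(s_next) - potential(s)


def discounted_return(rewards, gamma):
    """Discounted sum of a finite reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical chain of states 0..4 with the goal at state 4.
# Assumed potential: negative distance to the goal.
def potential(s):
    return -abs(4 - s)


gamma = 0.9
trajectory = [0, 1, 2, 3, 4]
rewards = [0.0, 0.0, 0.0, 1.0]  # sparse reward, paid only on reaching the goal

shaped = [
    shaped_reward(r, s, s_next, potential, gamma)
    for r, s, s_next in zip(rewards, trajectory, trajectory[1:])
]

plain_return = discounted_return(rewards, gamma)
shaped_return = discounted_return(shaped, gamma)

# The shaping terms telescope, so the two returns differ by a quantity that
# depends only on the first and last state of the trajectory.
offset = gamma ** len(rewards) * potential(trajectory[-1]) - potential(trajectory[0])
```

Because the difference between the shaped and unshaped returns depends only on the endpoints of the trajectory, shaping of this form gives denser feedback during learning without changing which policy is optimal.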

Abstract

We propose a compositional approach to synthesize policies for networks of continuous-space stochastic control systems with unknown dynamics using model-free reinforcement learning (RL). The approach is based on implicitly abstracting each subsystem in the network with a finite Markov decision process with unknown transition probabilities, synthesizing a strategy for each abstract model in an assume-guarantee fashion using RL, and then mapping the results back over the original network with approximate optimality guarantees. We provide lower bounds on the satisfaction probability of the overall network based on those over individual subsystems. A key contribution is to leverage the convergence results for adversarial RL (minimax Q-learning) on finite stochastic arenas to provide control strategies maximizing the probability of satisfaction over the network of continuous-space systems.…
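
To make the adversarial RL ingredient concrete, here is a minimal sketch of tabular max-min Q-learning on a tiny finite stochastic arena. It is not the paper's algorithm or benchmark. The three-state arena, the probabilities in `P_GOAL`, the restriction to pure strategies, the uniform exploration, and all function names are assumptions made for illustration. The adversary stands in for the unknown influence of neighbouring subsystems in the assume-guarantee setting.

```python
import random


def minimax_q_learning(step, n_states, n_ctrl, n_adv, start=0,
                       episodes=4000, horizon=10, gamma=1.0, seed=0):
    """Tabular max-min Q-learning with pure strategies (illustrative only)."""
    rng = random.Random(seed)
    Q = [[[0.0] * n_adv for _ in range(n_ctrl)] for _ in range(n_states)]
    visits = [[[0] * n_adv for _ in range(n_ctrl)] for _ in range(n_states)]

    def value(s):
        # The controller maximizes the worst case over adversary actions.
        return max(min(Q[s][a]) for a in range(n_ctrl))

    for _ in range(episodes):
        s = start
        for _ in range(horizon):
            a = rng.randrange(n_ctrl)  # uniform exploration for both players
            b = rng.randrange(n_adv)
            s_next, reward, done = step(s, a, b, rng)
            target = reward if done else reward + gamma * value(s_next)
            visits[s][a][b] += 1
            # A step size of 1/visits makes Q a running average of the targets.
            Q[s][a][b] += (target - Q[s][a][b]) / visits[s][a][b]
            if done:
                break
            s = s_next

    policy = [max(range(n_ctrl), key=lambda a, s=s: min(Q[s][a]))
              for s in range(n_states)]
    return Q, policy


# Hypothetical arena: state 0 is the start, GOAL and UNSAFE are absorbing.
GOAL, UNSAFE = 1, 2
# Assumed probability of reaching GOAL for each (controller, adversary) pair.
P_GOAL = {(0, 0): 0.8, (0, 1): 0.8, (1, 0): 0.95, (1, 1): 0.3}


def toy_step(s, a, b, rng):
    """From state 0, move to GOAL with probability P_GOAL[(a, b)], else UNSAFE."""
    if rng.random() < P_GOAL[(a, b)]:
        return GOAL, 1.0, True
    return UNSAFE, 0.0, True


Q, policy = minimax_q_learning(toy_step, n_states=3, n_ctrl=2, n_adv=2)
```

In this toy arena, controller action 1 looks attractive against a cooperative adversary but performs poorly against a hostile one, so the learned `policy[0]` should select action 0, whose worst-case probability of reaching the goal is higher. The worst-case value `min(Q[0][0])` plays the role of a lower bound on the satisfaction probability for this single abstract model.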
