Compositional Reinforcement Learning for Discrete-Time Stochastic Control Systems
Abolfazl Lavaei, Mateo Perez, Milad Kazemi, Fabio Somenzi, Sadegh Soudjani, Ashutosh Trivedi, Majid Zamani

TL;DR
This paper introduces a compositional, model-free reinforcement learning method for synthesizing policies in networks of stochastic control systems with unknown dynamics, providing guarantees on overall system satisfaction probabilities.
Contribution
It develops a novel compositional approach using assume-guarantee reasoning and adversarial RL to handle unknown dynamics in continuous-space systems with formal temporal logic specifications.
Findings
Effective policy synthesis demonstrated on temperature regulation network
Successful control of traffic network with probabilistic guarantees
Accelerated learning via potential-based reward shaping
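The last finding refers to potential-based reward shaping. As a minimal sketch of the general technique (not the paper's specific construction), the shaped reward adds gamma * phi(s') - phi(s) to each step's reward. The potential function phi, the toy trajectory, and all function names below are illustrative assumptions. The shaping terms telescope along a trajectory, so the shaped return differs from the plain return only by a term that depends on the first and last states.

```python
from typing import Callable, Sequence


def shaping_term(phi: Callable[[int], float], s: int, s_next: int, gamma: float) -> float:
    """Potential-based shaping bonus: gamma * phi(s') - phi(s)."""
    return gamma * phi(s_next) - phi(s)


def plain_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted sum of the original rewards."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


def shaped_return(states: Sequence[int], rewards: Sequence[float],
                  phi: Callable[[int], float], gamma: float) -> float:
    """Discounted sum of rewards with the shaping bonus added at every step.

    states has one more entry than rewards: states[t] -> states[t + 1] earns rewards[t].
    """
    total = 0.0
    for t, r in enumerate(rewards):
        total += gamma ** t * (r + shaping_term(phi, states[t], states[t + 1], gamma))
    return total


# Hypothetical toy trajectory: phi is an assumed potential (here simply the state index).
states = [0, 1, 2, 3]
rewards = [0.0, 0.0, 1.0]
gamma = 0.9
phi = lambda s: float(s)

base = plain_return(rewards, gamma)
shaped = shaped_return(states, rewards, phi, gamma)
# Telescoping identity: shaped = base + gamma^T * phi(s_T) - phi(s_0)
offset = gamma ** len(rewards) * phi(states[-1]) - phi(states[0])
```

Because the offset does not depend on the actions taken in between, shaping of this form changes how quickly values propagate during learning without changing which policies are optimal.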
Abstract
We propose a compositional approach to synthesize policies for networks of continuous-space stochastic control systems with unknown dynamics using model-free reinforcement learning (RL). The approach is based on implicitly abstracting each subsystem in the network with a finite Markov decision process with unknown transition probabilities, synthesizing a strategy for each abstract model in an assume-guarantee fashion using RL, and then mapping the results back over the original network with approximate optimality guarantees. We provide lower bounds on the satisfaction probability of the overall network based on those over individual subsystems. A key contribution is to leverage the convergence results for adversarial RL (minimax Q-learning) on finite stochastic arenas to provide control strategies maximizing the probability of satisfaction over the network of continuous-space systems.…
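To make the adversarial RL step concrete, the following is a minimal tabular sketch of a minimax Q-learning update. It assumes a finite arena in which the controller picks action a, an adversary (standing in for the other subsystems' effect) picks b, and the state value is the pure-strategy max over a of the min over b. The table layout, the function names, and the pure-strategy simplification are assumptions made for illustration; the excerpt does not specify the exact update rule used by the authors.

```python
from typing import List

# Q[s][a][b]: value of controller action a against adversary action b in state s.
QTable = List[List[List[float]]]


def maxmin_value(q_state: List[List[float]]) -> float:
    """Controller maximizes the worst case over adversary responses."""
    return max(min(row) for row in q_state)


def minimax_q_update(Q: QTable, s: int, a: int, b: int, r: float,
                     s_next: int, alpha: float, gamma: float) -> float:
    """One temporal-difference step toward r + gamma * max_a' min_b' Q[s_next][a'][b']."""
    target = r + gamma * maxmin_value(Q[s_next])
    Q[s][a][b] += alpha * (target - Q[s][a][b])
    return Q[s][a][b]


# Hypothetical two-state arena with two controller and two adversary actions.
Q: QTable = [
    [[0.0, 0.0], [0.0, 0.0]],   # state 0: nothing learned yet
    [[1.0, 0.0], [0.5, 0.6]],   # state 1: worst cases are 0.0 and 0.5
]
updated = minimax_q_update(Q, s=0, a=0, b=1, r=0.0, s_next=1, alpha=0.5, gamma=1.0)
```

In state 1 the second controller action has the better worst case (0.5 versus 0.0), so the update moves Q[0][0][1] halfway from 0 toward 0.5.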
