Exploration Policies for On-the-Fly Controller Synthesis: A Reinforcement Learning Approach

Tomás Delgado; Marco Sánchez Sorondo; Víctor Braberman; Sebastián Uchitel

arXiv:2210.05393 · cs.LG · May 5, 2023

TL;DR

This paper introduces a reinforcement learning-based heuristic for on-the-fly controller synthesis, enabling efficient, generalized strategy generation in non-deterministic environments without exhaustive exploration.

Contribution

It proposes a novel RL approach with a modified DQN to learn heuristics that generalize to larger problem instances, improving over domain-independent heuristics.

Findings

1. RL-based heuristics outperform existing heuristics in unseen instances
2. Heuristics learned on small problems generalize to larger instances
3. The approach enables zero-shot policy transfer in controller synthesis

Abstract

Controller synthesis is in essence a case of model-based planning for non-deterministic environments in which plans (actually "strategies") are meant to preserve system goals indefinitely. In the case of supervisory control, environments are specified as the parallel composition of state machines, and valid strategies are required to be "non-blocking" (i.e., always enabling the environment to reach certain marked states) in addition to safe (i.e., keeping the system within a safe zone). Recently, On-the-fly Directed Controller Synthesis techniques were proposed to avoid exploring the entire (and exponentially large) environment space, at the cost of non-maximal permissiveness, to either find a strategy or conclude that there is none. The incremental exploration of the plant is currently guided by a domain-independent human-designed heuristic. In this work, we propose a new method…
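To make the idea of heuristic-guided incremental exploration concrete, here is a minimal Python sketch. It is not the paper's algorithm: it does not distinguish controllable from uncontrollable events and does not check non-blocking or safety. It only shows how a pluggable scoring function decides which frontier transition of a lazily built parallel composition gets expanded next. All names (`compose_step`, `explore`, `progress`, `blind`) and the toy state machines are hypothetical, and the hand-written `progress` score merely stands in for a learned one.

```python
import heapq
from itertools import count


def compose_step(state, machines):
    """Lazily yield (label, next_state) for the composed state.

    Simplifying assumption: the machines share no labels, so they
    simply interleave. Each machine maps a local state to a list of
    (label, target) pairs.
    """
    for i, machine in enumerate(machines):
        for label, target in machine.get(state[i], []):
            yield label, state[:i] + (target,) + state[i + 1:]


def explore(initial, machines, marked, heuristic):
    """Expand one frontier transition at a time, highest score first.

    Returns how many transitions were expanded before a marked state
    was reached, or None if the reachable space ran out first.
    """
    if initial in marked:
        return 0
    tie = count()  # insertion order breaks ties between equal scores
    frontier = []  # min-heap of (-score, tie, next_state)
    seen = {initial}

    def push_successors(state):
        for label, nxt in compose_step(state, machines):
            score = heuristic(state, label, nxt)
            heapq.heappush(frontier, (-score, next(tie), nxt))

    push_successors(initial)
    expanded = 0
    while frontier:
        _, _, nxt = heapq.heappop(frontier)
        expanded += 1
        if nxt in marked:
            return expanded
        if nxt not in seen:
            seen.add(nxt)
            push_successors(nxt)
    return None


# Toy plant: two independent three-state chains (hypothetical example).
left = {0: [("a", 1)], 1: [("b", 2)]}
right = {0: [("c", 1)], 1: [("d", 2)]}
machines = [left, right]
marked = {(2, 2)}


def progress(state, label, nxt):
    """Hand-written stand-in for a learned score: prefer advanced states."""
    return sum(nxt)


def blind(state, label, nxt):
    """Uninformed baseline: every transition scores the same."""
    return 0


informed_cost = explore((0, 0), machines, marked, progress)
blind_cost = explore((0, 0), machines, marked, blind)
print(informed_cost, blind_cost)
```

On this toy plant the informed score reaches the marked state after fewer expansions than the uninformed baseline, which is the kind of saving a better exploration heuristic aims for.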

Code & Models

Repositories

tdelgado00/learning-synthesis
PyTorch · Official

Taxonomy

Topics: Formal Methods in Verification · Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics

Methods: Convolution · Dense Connections · Q-Learning · Deep Q-Network