STL-Based Synthesis of Feedback Controllers Using Reinforcement Learning
Nikhil Kumar Singh, Indranil Saha

TL;DR
This paper introduces a reward mechanism for reinforcement learning (RL) based on Signal Temporal Logic (STL). The mechanism enables the synthesis of feedback controllers for complex systems with temporal logic specifications and is validated on continuous control benchmarks.
Contribution
The paper proposes a new quantitative semantics for STL, tailored to reward generation in RL, with the aim of improving controller synthesis for complex dynamical systems.
Findings
The proposed STL semantics is reported to be more effective for controller synthesis than existing STL semantics.
Experiments on continuous control benchmarks are reported to show improved performance.
The approach effectively encodes complex temporal specifications into RL rewards.
Abstract
Deep Reinforcement Learning (DRL) has the potential to be used for synthesizing feedback controllers (agents) for various complex systems with unknown dynamics. These systems are expected to satisfy diverse safety and liveness properties, which are best captured using temporal logic. In RL, the reward function plays a crucial role in specifying the desired behaviour of these agents. However, the problem of designing the reward function for an RL agent to satisfy complex temporal logic specifications has received limited attention in the literature. To address this, we provide a systematic way of generating rewards in real time by using the quantitative semantics of Signal Temporal Logic (STL), a temporal logic widely used to specify the behaviour of cyber-physical systems. We propose a new quantitative semantics for STL having several desirable properties, making it suitable for reward generation.…
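To illustrate what using a quantitative semantics as a reward can look like, here is a minimal Python sketch. It implements the classical min/max robustness semantics of STL for a one-dimensional discrete-time trace, not the new semantics proposed in the paper, which this excerpt does not define. All names (predicate, always, eventually, conjunction, the example thresholds and trace) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence

Trace = Sequence[float]
# A robustness function maps (trace, time index) to a real-valued score.
Robustness = Callable[[Trace, int], float]


def predicate(f: Callable[[float], float]) -> Robustness:
    """Atomic predicate f(x) > 0; its robustness at time t is f(x_t)."""
    return lambda trace, t: f(trace[t])


def negation(phi: Robustness) -> Robustness:
    """Robustness of 'not phi' is the negated robustness of phi."""
    return lambda trace, t: -phi(trace, t)


def conjunction(phi: Robustness, psi: Robustness) -> Robustness:
    """Robustness of 'phi and psi' is the minimum of the two."""
    return lambda trace, t: min(phi(trace, t), psi(trace, t))


def always(phi: Robustness, a: int, b: int) -> Robustness:
    """G[a,b] phi: worst case of phi over steps t+a..t+b.

    Assumes the trace is long enough to cover index t+b.
    """
    return lambda trace, t: min(phi(trace, k) for k in range(t + a, t + b + 1))


def eventually(phi: Robustness, a: int, b: int) -> Robustness:
    """F[a,b] phi: best case of phi over steps t+a..t+b.

    Assumes the trace is long enough to cover index t+b.
    """
    return lambda trace, t: max(phi(trace, k) for k in range(t + a, t + b + 1))


# Hypothetical specification: stay below 5.0 (safety) and
# exceed 2.5 at some point (liveness), both within steps 0..3.
safe = always(predicate(lambda x: 5.0 - x), 0, 3)
reach = eventually(predicate(lambda x: x - 2.5), 0, 3)
spec = conjunction(safe, reach)

trace = [0.0, 1.0, 2.0, 3.0]  # illustrative state trajectory
reward = spec(trace, 0)       # robustness at time 0, used as a scalar reward
print(reward)                 # 0.5
```

Under this classical semantics, a positive value means the trace satisfies the specification and the magnitude indicates the margin, which is what makes it a candidate for a scalar RL reward.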
Taxonomy
Topics: Formal Methods in Verification · Advanced Software Engineering Methodologies
