Logic-informed reinforcement learning for cross-domain optimization of large-scale cyber-physical systems

Guangxi Wan; Peng Zeng; Xiaoting Dong; Chunhe Song; Shijie Cui; Dong Li; Qingwei Dong; Yiyang Liu; Hongfei Bai

arXiv:2511.00806·cs.LG·November 4, 2025

Open Access

TL;DR

This paper introduces logic-informed reinforcement learning (LIRL), a method that guarantees constraint satisfaction and improves optimization in large-scale cyber-physical systems by integrating logic-based projections into policy-gradient algorithms.

Contribution

LIRL is a novel approach that incorporates on-the-fly logic-based projections into reinforcement learning, ensuring feasibility and safety without reward penalty tuning.

Findings

01. LIRL outperforms existing methods in manufacturing, EV charging, and traffic control.
02. Achieves up to a 44.33% reduction in the combined makespan-energy objective.
03. Maintains zero constraint violations across all tested scenarios.

Abstract

Cyber-physical systems (CPS) require the joint optimization of discrete cyber actions and continuous physical parameters under stringent safety logic constraints. Existing hierarchical approaches often compromise global optimality, whereas reinforcement learning (RL) in hybrid action spaces typically relies on brittle reward penalties, masking, or shielding, and struggles to guarantee constraint satisfaction. We present logic-informed reinforcement learning (LIRL), which equips standard policy-gradient algorithms with a projection that maps a low-dimensional latent action onto the admissible hybrid manifold defined on-the-fly by first-order logic. This guarantees the feasibility of every exploratory step without penalty tuning. Experimental evaluations have been conducted across multiple scenarios, including industrial manufacturing, electric vehicle charging stations, and traffic signal…
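To make the projection idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the EV-charging setting, the `Option` type, the `is_free` rule, the `bounds` function, and the `headroom_kw` state key are all assumptions chosen for illustration. It shows one simple way a two-dimensional latent action could be mapped onto a hybrid (discrete choice, continuous value) action that satisfies state-dependent logical rules by construction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]


@dataclass(frozen=True)
class Option:
    """A hypothetical discrete action with a nominal continuous range."""
    name: str
    low: float
    high: float


Rule = Callable[[State, Option], bool]


def is_free(state: State, option: Option) -> bool:
    # Illustrative logic rule: a charger may only be chosen if it is idle.
    return state[f"{option.name}_busy"] == 0.0


def bounds(state: State, option: Option) -> Tuple[float, float]:
    # Illustrative physical rule: power is capped by the grid headroom.
    return option.low, min(option.high, state["headroom_kw"])


def admissible(state: State, options: List[Option], rules: List[Rule]) -> List[Option]:
    """Keep options that satisfy every rule and have a non-empty range."""
    feasible = []
    for option in options:
        low, high = bounds(state, option)
        if low <= high and all(rule(state, option) for rule in rules):
            feasible.append(option)
    return feasible


def _unit(z: float) -> float:
    """Map a latent coordinate from [-1, 1] to [0, 1], clamping outliers."""
    return min(max((z + 1.0) / 2.0, 0.0), 1.0)


def project(latent: Tuple[float, float], state: State,
            options: List[Option], rules: List[Rule]) -> Tuple[str, float]:
    """Map a latent action onto the admissible hybrid action set."""
    feasible = admissible(state, options, rules)
    if not feasible:
        raise ValueError("no admissible action in this state")
    z_disc, z_cont = latent
    # The first coordinate selects among the currently feasible options only.
    index = min(int(_unit(z_disc) * len(feasible)), len(feasible) - 1)
    choice = feasible[index]
    # The second coordinate is rescaled into the state-dependent range.
    low, high = bounds(state, choice)
    return choice.name, low + _unit(z_cont) * (high - low)


OPTIONS = [
    Option("charger_a", 3.0, 11.0),
    Option("charger_b", 3.0, 22.0),
    Option("charger_c", 10.0, 50.0),
]
RULES = [is_free]
```

Because the policy only ever emits a latent vector and the environment only ever receives the projected action, every exploratory step in this sketch is feasible without adding a penalty term to the reward. For example, with `charger_a` busy and 8 kW of headroom, any latent action resolves to `charger_b` at a power between 3 kW and 8 kW.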

Taxonomy

Topics: Smart Grid Security and Resilience · Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control