Safe Deep Reinforcement Learning for Resource Allocation with Peak Age of Information Violation Guarantees
Berire Gunes Reyhan, Sinem Coleri

TL;DR
This paper introduces a safe deep reinforcement learning (DRL) framework for wireless networked control systems that minimizes power consumption while guaranteeing that peak age of information (PAoI) violation probability constraints are satisfied.
Contribution
It combines optimization theory with safe DRL to ensure constraint satisfaction and improve resource allocation in ultra-reliable wireless control systems.
Findings
Outperforms rule-based and other DRL benchmarks in simulations.
Achieves faster convergence and higher rewards.
Ensures constraint compliance with a teacher-student DRL framework.
Abstract
In Wireless Networked Control Systems (WNCSs), control and communication systems must be co-designed due to their strong interdependence. This paper presents, for the first time in the literature, an optimization theory-based safe deep reinforcement learning (DRL) framework for ultra-reliable WNCSs that ensures constraint satisfaction while optimizing performance. The approach minimizes power consumption under key constraints, including Peak Age of Information (PAoI) violation probability, transmit power, and schedulability in the finite blocklength regime. PAoI violation probability is uniquely derived by combining stochastic maximum allowable transfer interval (MATI) and maximum allowable packet delay (MAD) constraints in a multi-sensor network. The framework consists of two stages: optimization theory and safe DRL. The first stage derives optimality conditions to establish…
