Action-Conditioned Risk Gating for Safety-Critical Control under Partial Observability
Yushen Liu, Yin-Jen Chen, Ziyi Chen, Tao Wang, Heng Huang, Xugui Zhou, Yanfu Zhang

TL;DR
This paper introduces a lightweight reinforcement learning method that uses action-conditioned risk predictions to improve safety and efficiency in partially observable, safety-critical control tasks.
Contribution
It proposes a novel risk-gated RL approach that constructs a compact proxy state and predicts near-term safety violations to guide decision-making under partial observability.
Findings
Improves glycemic control tradeoffs in glucose regulation tasks.
Reduces runtime compared with a belief-space planning baseline.
Achieves a better reward-cost balance in navigation benchmarks.
Abstract
Many safety-critical control problems are modeled as risk-sensitive partially observable Markov decision processes, where the controller must make decisions from incomplete observations while balancing task performance against safety risk. Although belief-space planning provides a principled solution, maintaining and planning over beliefs can be computationally costly and sensitive to model specification in practical domains. We propose a lightweight risk-gated reinforcement learning approximation for risk-sensitive control under partial observability. The method constructs a compact finite-history proxy state and learns an action-conditioned predictor of near-term safety violation. This predicted candidate-action risk is used in two complementary ways: as a risk penalty during value learning, and as a decision-time gate that interpolates between optimistic and conservative ensemble…
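The two uses of predicted risk described in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so the exact form of the gate is not known from this page. The sketch below assumes that the optimistic estimate is the ensemble maximum, the conservative estimate is the ensemble minimum, the gate is a linear blend weighted by predicted risk, and the penalty is a fixed multiple of risk subtracted from the reward. The names `penalized_reward`, `gated_values`, `select_action`, and the weight `lam` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def penalized_reward(reward: float, risk: float, lam: float) -> float:
    """Assumed risk penalty for value learning: subtract lam * risk."""
    return reward - lam * risk


def gated_values(q_ensemble: Sequence[Sequence[float]],
                 risk: Sequence[float]) -> List[float]:
    """Blend optimistic and conservative ensemble estimates per action.

    q_ensemble[m][a] is ensemble member m's value for candidate action a.
    risk[a] is the predicted probability (in [0, 1]) of a near-term safety
    violation if action a is taken; the predictor itself is not shown.
    """
    blended = []
    for a in range(len(risk)):
        member_values = [member[a] for member in q_ensemble]
        optimistic = max(member_values)     # assumption: ensemble maximum
        conservative = min(member_values)   # assumption: ensemble minimum
        gate = min(1.0, max(0.0, risk[a]))  # clamp the gate weight to [0, 1]
        # Low risk trusts the optimistic value; high risk falls back
        # to the conservative one.
        blended.append((1.0 - gate) * optimistic + gate * conservative)
    return blended


def select_action(q_ensemble: Sequence[Sequence[float]],
                  risk: Sequence[float]) -> int:
    """Pick the candidate action with the highest gated value."""
    values = gated_values(q_ensemble, risk)
    return max(range(len(values)), key=values.__getitem__)


# Two ensemble members, two candidate actions.
q = [[1.0, 0.5],
     [0.0, 0.4]]
safe_choice = select_action(q, risk=[0.0, 0.0])   # no predicted risk
risky_choice = select_action(q, risk=[1.0, 0.0])  # action 0 flagged as risky
```

With no predicted risk, the optimistic estimates decide and action 0 is chosen. When action 0 is flagged as risky, its value collapses to the conservative estimate and action 1 is chosen instead. In the paper's setting the values and risks would be computed from the finite-history proxy state rather than supplied by hand.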
