Risk-sensitive Actor-free Policy via Convex Optimization
Ruoqi Zhang, Jens Sjölund

TL;DR
This paper introduces a risk-sensitive, actor-free reinforcement learning policy. It models an objective based on the conditional value at risk (CVaR) with an input-convex neural network, so that globally optimal actions can be found by simple gradient-following methods.
Contribution
It proposes a novel actor-free policy framework that employs input-convex neural networks for risk-sensitive optimization in reinforcement learning.
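To make the role of input convexity concrete, here is a minimal sketch of a network that is convex in the action. It is an illustration under stated assumptions, not the architecture from the paper: the layer sizes, the random weights, and the names make_icnn and icnn are all hypothetical. The key constraint is that weights applied to hidden activations are non-negative and the activation (ReLU) is convex and non-decreasing.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def make_icnn(state_dim, action_dim, hidden, seed=0):
    """Random weights for a tiny network that is convex in the action (hypothetical)."""
    rng = random.Random(seed)

    def g():
        return rng.uniform(-1.0, 1.0)

    return {
        "Wa0": [[g() for _ in range(action_dim)] for _ in range(hidden)],
        "Ws0": [[g() for _ in range(state_dim)] for _ in range(hidden)],
        "b0": [g() for _ in range(hidden)],
        # Weights on the hidden path are non-negative to preserve convexity.
        "Wz1": [abs(g()) for _ in range(hidden)],
        # A direct affine path from the action does not affect convexity.
        "Wa1": [g() for _ in range(action_dim)],
        "b1": g(),
    }


def icnn(params, state, action):
    """Scalar output that is convex in `action` for any fixed `state`."""
    hidden = []
    for wa, ws, b in zip(params["Wa0"], params["Ws0"], params["b0"]):
        pre = sum(w * a for w, a in zip(wa, action))
        pre += sum(w * s for w, s in zip(ws, state)) + b
        hidden.append(relu(pre))  # ReLU of an affine map is convex in the action
    out = sum(w * h for w, h in zip(params["Wz1"], hidden))
    out += sum(w * a for w, a in zip(params["Wa1"], action)) + params["b1"]
    return out


# Midpoint convexity check: f((a1 + a2) / 2) <= (f(a1) + f(a2)) / 2.
params = make_icnn(state_dim=2, action_dim=2, hidden=4)
state = [0.3, -0.7]
a1, a2 = [1.0, -2.0], [-1.5, 0.5]
mid = [(x + y) / 2.0 for x, y in zip(a1, a2)]
gap = 0.5 * icnn(params, state, a1) + 0.5 * icnn(params, state, a2) - icnn(params, state, mid)
```

The value gap is non-negative (up to rounding) for any pair of actions, which is the property that rules out spurious local minima over the action.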
Findings
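Experiments demonstrate effective risk control.
The input-convex neural network makes the objective convex in the actions, so simple gradient-following methods can identify globally optimal actions.
The actor-free formulation simplifies the policy optimization process.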
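The link between convexity and global optimality can be illustrated with a small sketch. The objective below is a hypothetical smooth convex stand-in for a learned risk-sensitive objective over a one-dimensional action; it is not taken from the paper, and the step size and iteration count are arbitrary choices. Because the function is convex, plain gradient descent reaches the same minimizer from every starting action.

```python
import math


def objective(a):
    # Hypothetical convex stand-in: softplus term plus a quadratic term.
    return math.log1p(math.exp(a - 1.0)) + 0.5 * (a + 0.5) ** 2


def gradient(a):
    # Derivative of the softplus term is a sigmoid; the quadratic gives (a + 0.5).
    sigmoid = 1.0 / (1.0 + math.exp(-(a - 1.0)))
    return sigmoid + (a + 0.5)


def descend(a0, step=0.2, iters=500):
    """Plain gradient descent on the action, starting from a0."""
    a = a0
    for _ in range(iters):
        a -= step * gradient(a)
    return a


# Different initial actions converge to the same global minimizer.
starts = [-5.0, 0.0, 5.0]
solutions = [descend(a0) for a0 in starts]
```

With a non-convex objective, the same procedure could stop at different local minima depending on the starting action, which is the situation the convex parameterization avoids.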
Abstract
Traditional reinforcement learning methods optimize agents without considering safety, potentially resulting in unintended consequences. In this paper, we propose an optimal actor-free policy that optimizes a risk-sensitive criterion based on the conditional value at risk. The risk-sensitive objective function is modeled using an input-convex neural network ensuring convexity with respect to the actions and enabling the identification of globally optimal actions through simple gradient-following methods. Experimental results demonstrate the efficacy of our approach in maintaining effective risk control.
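For readers unfamiliar with the conditional value at risk, the sketch below estimates it from samples. It assumes a loss convention (larger is worse) and a confidence level alpha, so that CVaR is the mean of the worst (1 - alpha) fraction of outcomes; the abstract does not state which convention the paper uses, and the function name cvar and the sample data are hypothetical.

```python
import math


def cvar(losses, alpha):
    """Empirical CVaR: mean of the worst ceil((1 - alpha) * n) losses."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not losses:
        raise ValueError("losses must be non-empty")
    ordered = sorted(losses, reverse=True)  # worst outcomes first
    # Round before taking the ceiling to guard against floating-point noise.
    k = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


losses = [1.0, 2.0, 3.0, 4.0, 10.0]
risk_neutral = cvar(losses, 0.0)   # plain mean of all losses
risk_averse = cvar(losses, 0.8)    # mean of the single worst loss
```

At alpha = 0 the estimate reduces to the ordinary mean, and as alpha grows it focuses on an ever smaller tail of bad outcomes, which is what makes the criterion risk-sensitive.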
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)
