Risk-sensitive Actor-free Policy via Convex Optimization

Ruoqi Zhang; Jens Sjölund

arXiv:2307.00141 · cs.LG · July 4, 2023

TL;DR

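This paper introduces a risk-sensitive, actor-free reinforcement learning policy that optimizes a criterion based on the conditional value at risk (CVaR). The criterion is modeled by an input-convex neural network, so globally optimal actions can be found with simple gradient-following methods.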

Contribution

The paper proposes an actor-free policy framework in which an input-convex neural network models the risk-sensitive objective, so that actions are chosen by optimizing that objective directly with respect to the action.

Findings

01 Experiments demonstrate effective risk control

02 Convexity of the network with respect to the actions allows globally optimal actions to be identified

03 The actor-free formulation simplifies the policy optimization process
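
To make the second finding concrete, here is a minimal sketch of a function that is convex in the action by construction, minimized by plain gradient descent. It is not the authors' architecture or training code: the weights (`W0`, `WZ`, `WA`, `WO`), the layer sizes, the softplus activation, the step size, and the choice to treat the output as a cost to minimize are all assumptions made for illustration. Convexity follows from two standard facts: softplus is convex and nondecreasing, and the weights applied to hidden activations (`WZ`, `WO`) are non-negative.

```python
import math

def softplus(x):
    # Convex, nondecreasing activation (numerically stable form).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    # Derivative of softplus, written to avoid overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Hypothetical fixed weights for a 2-D action (illustration only).
W0 = [[1.0, -0.5], [-1.0, 0.5], [0.3, 1.0], [-0.3, -1.0]]  # action -> hidden 1
B0 = [0.0, 0.2, -0.1, 0.1]
WZ = [[0.5, 0.2, 0.0, 0.3], [0.1, 0.4, 0.6, 0.0]]  # hidden 1 -> hidden 2, must be >= 0
WA = [[0.2, -0.1], [-0.3, 0.1]]                    # action passthrough, any sign
B1 = [0.0, -0.2]
WO = [1.0, 0.5]                                    # hidden 2 -> output, must be >= 0

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def forward(action):
    # Returns the scalar output plus both pre-activations (reused by the gradient).
    pre1 = [dot(w, action) + b for w, b in zip(W0, B0)]
    z1 = [softplus(p) for p in pre1]
    pre2 = [dot(wz, z1) + dot(wa, action) + b for wz, wa, b in zip(WZ, WA, B1)]
    z2 = [softplus(p) for p in pre2]
    return dot(WO, z2), pre1, pre2

def objective(action):
    return forward(action)[0]

def gradient(action):
    # Chain rule through the two softplus layers.
    _, pre1, pre2 = forward(action)
    grad = [0.0] * len(action)
    for k, p2 in enumerate(pre2):
        outer = WO[k] * sigmoid(p2)
        for i in range(len(action)):
            inner = WA[k][i] + sum(
                WZ[k][j] * sigmoid(pre1[j]) * W0[j][i] for j in range(len(W0))
            )
            grad[i] += outer * inner
    return grad

def descend(action, step=0.5, iters=200):
    # Fixed-step gradient descent on the action only; weights stay frozen.
    a = list(action)
    for _ in range(iters):
        g = gradient(a)
        a = [x - step * gx for x, gx in zip(a, g)]
    return a

start = [2.0, -1.5]
end = descend(start)
print(objective(start), objective(end))
```

Because the function is convex in the action, any stationary point that gradient descent reaches is a global minimizer over actions, which is why no separate actor network is needed to propose actions in this kind of setup.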

Abstract

Traditional reinforcement learning methods optimize agents without considering safety, potentially resulting in unintended consequences. In this paper, we propose an optimal actor-free policy that optimizes a risk-sensitive criterion based on the conditional value at risk. The risk-sensitive objective function is modeled using an input-convex neural network ensuring convexity with respect to the actions and enabling the identification of globally optimal actions through simple gradient-following methods. Experimental results demonstrate the efficacy of our approach in maintaining effective risk control.
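
As an illustration of the conditional value at risk named in the abstract, the following minimal sketch estimates CVaR from a finite set of sampled costs. The function name `empirical_cvar`, the convention that higher cost is worse, and the sample data are assumptions for illustration; the paper may define the criterion on returns instead, and this is not presented as its estimator.

```python
import math

def empirical_cvar(costs, alpha):
    """Average of the worst (1 - alpha) fraction of sampled costs.

    Assumes higher cost is worse; alpha is the confidence level in [0, 1).
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(costs, reverse=True)  # worst outcomes first
    # Number of tail samples; rounding guards against float noise in (1 - alpha) * n.
    k = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:k]
    return sum(tail) / len(tail)

sampled_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up data
print(empirical_cvar(sampled_costs, 0.8))  # mean of the two largest costs
print(empirical_cvar(sampled_costs, 0.0))  # plain mean of all costs
```

With `alpha = 0` the estimate reduces to the ordinary mean, which is the risk-neutral case; raising `alpha` focuses the criterion on progressively worse outcomes.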

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)