Policy Gradients for Cumulative Prospect Theory in Reinforcement Learning

Olivier Lepel; Anas Barakat

arXiv:2410.02605 · cs.LG · February 18, 2026

Open Access

TL;DR

This paper derives a policy gradient theorem for Cumulative Prospect Theory (CPT) objectives in finite-horizon reinforcement learning and uses it to build a first-order policy gradient algorithm with asymptotic convergence guarantees.

Contribution

It generalizes the standard policy gradient theorem to CPT objectives, introduces a Monte Carlo gradient estimator based on order statistics, and proves asymptotic convergence of the resulting algorithm to first-order stationary points.

Findings

01 The algorithm converges asymptotically to first-order stationary points of the (generally non-convex) CPT objective.

02 Simulations illustrate the qualitative behaviors that CPT induces in RL.

03 In the simulations, the first-order method outperforms existing zeroth-order approaches.

Abstract

We derive a policy gradient theorem for Cumulative Prospect Theory (CPT) objectives in finite-horizon Reinforcement Learning (RL), generalizing the standard policy gradient theorem and encompassing distortion-based risk objectives as special cases. Motivated by behavioral economics, CPT combines an asymmetric utility transformation around a reference point with probability distortion. Building on our theorem, we design a first-order policy gradient algorithm for CPT-RL using a Monte Carlo gradient estimator based on order statistics. We establish statistical guarantees for the estimator and prove asymptotic convergence of the resulting algorithm to first-order stationary points of the (generally non-convex) CPT objective. Simulations illustrate qualitative behaviors induced by CPT and compare our first-order approach to existing zeroth-order methods.
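
To make the two CPT ingredients named in the abstract concrete (an asymmetric utility around a reference point, and probability distortion applied through order statistics), here is a minimal Python sketch of a sample-based CPT value estimate. It is an illustration under stated assumptions, not the paper's gradient estimator: it estimates only the CPT value of a batch of sampled returns, using the standard order-statistics form. The function names `tk_weight` and `cpt_value`, the power utilities, the Tversky-Kahneman weighting function, and the default parameters (the commonly cited Tversky and Kahneman 1992 values) are all assumptions made for this example.

```python
import numpy as np

def tk_weight(p, gamma):
    """Tversky-Kahneman probability weighting (assumed form); maps [0, 1] to [0, 1]."""
    p = np.asarray(p, dtype=float)
    return p**gamma / (p**gamma + (1.0 - p)**gamma) ** (1.0 / gamma)

def cpt_value(returns, ref=0.0, alpha=0.88, beta=0.88, lam=2.25,
              gamma_plus=0.61, gamma_minus=0.69):
    """Order-statistics estimate of the CPT value of sampled returns.

    All defaults are illustrative assumptions, not values taken from the paper.
    """
    # Shift by the reference point, then sort ascending to get order statistics.
    x = np.sort(np.asarray(returns, dtype=float) - ref)
    n = x.size
    i = np.arange(1, n + 1)

    # Asymmetric utilities: gains are concave, losses are scaled by lam > 1.
    u_plus = np.maximum(x, 0.0) ** alpha
    u_minus = lam * np.maximum(-x, 0.0) ** beta

    # Decision weights: differences of the distorted empirical survival
    # function (gains) and the distorted empirical CDF (losses).
    w_gain = tk_weight((n + 1 - i) / n, gamma_plus) - tk_weight((n - i) / n, gamma_plus)
    w_loss = tk_weight(i / n, gamma_minus) - tk_weight((i - 1) / n, gamma_minus)

    return float(np.sum(u_plus * w_gain) - np.sum(u_minus * w_loss))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sampled_returns = rng.normal(loc=0.0, scale=1.0, size=1000)
    print("CPT value estimate:", cpt_value(sampled_returns))
```

With identity utilities and no distortion (`alpha=beta=lam=gamma_plus=gamma_minus=1`), the estimate reduces to the sample mean, which mirrors the abstract's point that the CPT objective generalizes the standard expected-return objective.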

Taxonomy

Topics: Capital Investment and Risk Analysis · Auction Theory and Applications · Smart Grid Energy Management