Stochastic Policy Gradient Methods in the Uncertain Volatility Model
Lokman A Abbas-Turki (LPSM), Jean-François Chassagneux (ENSAE Paris), Jean-Philippe Lemor, Grégoire Loeper, Simon Sananes (LPSM)

TL;DR
This paper introduces a stochastic policy gradient method that uses shallow neural networks and a C-vine based squashed Gaussian policy for robust multidimensional option pricing under joint volatility and correlation uncertainty.
Contribution
It develops a backward actor-critic scheme, trained with Proximal Policy Optimization, whose C-vine based Gaussian policy makes high-dimensional robust option pricing problems tractable.
Findings
Accurately prices multidimensional derivatives under volatility uncertainty.
Remains computationally efficient compared to existing methods.
Outperforms Monte Carlo and machine-learning benchmarks in robustness.
Abstract
The multidimensional Uncertain Volatility Model leads to robust option pricing problems under joint volatility and correlation uncertainty. Their numerical resolution quickly becomes challenging because the associated stochastic control problem is high-dimensional. We propose a backward actor-critic stochastic policy gradient scheme tailored to this setting. The method combines a discrete dynamic programming principle with Proximal Policy Optimization and shallow neural-network approximations of both the value function and the control policy. A key ingredient is the policy parameterization: continuous controls are represented through a squashed Gaussian policy built on a C-vine representation of correlation matrices, which enforces positive semidefiniteness by construction. Numerical experiments on a range of multidimensional derivatives show that the method yields accurate prices,…
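To illustrate why a C-vine representation yields valid correlation matrices by construction, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `cvine_correlation`, the use of `tanh` as the squashing map, and the row-by-row ordering of the partial correlations are all assumptions made for illustration. The idea is that d(d-1)/2 unconstrained numbers are squashed into (-1, 1), read as C-vine partial correlations, and assembled into a Cholesky factor whose rows have unit norm.

```python
import numpy as np

def cvine_correlation(raw):
    """Map d(d-1)/2 unconstrained reals to a d x d correlation matrix.

    Hypothetical helper: each squashed value is treated as a C-vine partial
    correlation rho_{ij ; 0..j-1}, and the Cholesky factor is filled row by row.
    """
    raw = np.asarray(raw, dtype=float).ravel()
    # Recover d from the number of off-diagonal parameters.
    d = int(round((1.0 + np.sqrt(1.0 + 8.0 * raw.size)) / 2.0))
    if d * (d - 1) // 2 != raw.size:
        raise ValueError("raw must contain d(d-1)/2 entries for some integer d")

    z = np.tanh(raw)  # assumed squashing: partial correlations in (-1, 1)
    chol = np.zeros((d, d))
    chol[0, 0] = 1.0
    k = 0
    for i in range(1, d):
        remaining = 1.0  # squared norm still available in row i
        for j in range(i):
            chol[i, j] = z[k] * np.sqrt(remaining)
            remaining *= 1.0 - z[k] ** 2  # stays non-negative since |z| <= 1
            k += 1
        chol[i, i] = np.sqrt(remaining)  # closes the row to unit norm

    # chol @ chol.T is symmetric positive semidefinite with unit diagonal.
    return chol @ chol.T

# Example: a Gaussian draw (as a policy might produce) mapped to a 3 x 3 matrix.
rng = np.random.default_rng(0)
sample = 0.5 * rng.standard_normal(3)
corr = cvine_correlation(sample)
```

Because every row of the factor has unit norm, any real-valued input gives a matrix with ones on the diagonal and non-negative eigenvalues, so a Gaussian policy can sample freely without a projection step.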
