Convergence of Neural Network Policies for Risk–Reward Optimization

Chang Chen; Duy-Minh Dang

arXiv:2603.06563·q-fin.CP·March 9, 2026

Open Access

TL;DR

This paper introduces a neural network framework for multi-period risk-reward stochastic control problems with constrained policies, proves that the empirical optimum converges in probability to the true optimal value, and validates the approach with numerical experiments.

Contribution

The paper develops a neural network approach for risk-reward control problems whose feedback policies may be discontinuous in the state, with convergence guarantees and numerical validation.

Findings

01. Convergence of neural network solutions to the true optimal control as capacity and data increase.

02. Close agreement between learned controls and reference solutions in heat map visualizations.

03. Demonstrated robustness of the learned policies on large independent scenario sets.

Abstract

We develop a neural-network framework for multi-period risk–reward stochastic control problems with constrained two-step feedback policies that may be discontinuous in the state. We allow a broad class of objectives built on a finite-dimensional performance vector, including terminal and path-dependent statistics, with risk functionals admitting auxiliary-variable optimization representations (e.g., Conditional Value-at-Risk and buffered probability of exceedance) and optional moment dependence. Our approach parametrizes the two-step policy using two coupled feedforward networks with constraint-enforcing output layers, reducing the constrained control problem to unconstrained training over network parameters. Under mild regularity conditions, we prove that the empirical optimum of the NN-parametrized objective converges in probability to the true optimal value as network capacity and…
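Two ideas in the abstract can be made concrete with a small sketch: a constraint-enforcing output layer and an auxiliary-variable representation of a risk functional. The code below is an illustration under stated assumptions, not the authors' implementation. It assumes (i) a long-only, fully invested weight constraint enforced by a softmax output layer, which the abstract does not specify, and (ii) the loss-based Rockafellar-Uryasev form of Conditional Value-at-Risk, CVaR_alpha(L) = min_w { w + E[(L - w)^+] / (1 - alpha) }. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Map unconstrained logits to weights that are nonnegative and sum to 1.

    Assumption: the constraint is long-only and fully invested. With this
    output layer any real-valued logits give a feasible control, so training
    over the logits is unconstrained.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ru_objective(w, losses, alpha):
    """Empirical Rockafellar-Uryasev objective w + mean((L - w)^+) / (1 - alpha)."""
    excess = sum(max(loss - w, 0.0) for loss in losses) / len(losses)
    return w + excess / (1.0 - alpha)


def empirical_cvar(losses, alpha):
    """Minimize the RU objective over the auxiliary variable w.

    The objective is convex and piecewise linear in w with kinks at the
    sample points, so checking each sample value is enough.
    """
    return min(ru_objective(w, losses, alpha) for w in losses)


def tail_average(losses, alpha):
    """Average of the worst (1 - alpha) fraction of losses.

    Used only as a cross-check; assumes (1 - alpha) * n is an integer.
    """
    k = round((1.0 - alpha) * len(losses))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


if __name__ == "__main__":
    weights = softmax([0.3, -1.2, 2.0])
    print("weights:", weights, "sum:", sum(weights))

    losses = [float(i) for i in range(1, 11)]  # toy loss sample 1..10
    print("CVaR via auxiliary variable:", empirical_cvar(losses, 0.8))
    print("CVaR via tail average:      ", tail_average(losses, 0.8))
```

On the toy sample the two CVaR computations agree (9.5, up to floating-point rounding), which is the property that lets the auxiliary variable be trained jointly with the network parameters instead of sorting scenarios inside the objective.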

Taxonomy

Topics: Reinforcement Learning in Robotics · Risk and Portfolio Optimization · Adversarial Robustness in Machine Learning