Optimal Investment and Entropy-Regularized Learning Under Stochastic Volatility Models with Portfolio Constraints
Thai Nguyen, Pertiny Nkuize

TL;DR
This paper develops a continuous-time reinforcement learning framework for optimal portfolio selection under stochastic volatility, employing entropy regularization and PDE analysis to derive implementable policies.
Contribution
It introduces a novel entropy-regularized control approach with PDE-based analysis for portfolio optimization under stochastic volatility with constraints.
Findings
Derived the entropy-regularized HJB equation for portfolio control.
Proved existence of classical solutions under suitable structural conditions on the model coefficients.
Provided a PDE-based interpretation of actor-critic learning dynamics.
Abstract
We study the problem of optimal portfolio selection under stochastic volatility within a continuous-time reinforcement learning framework with portfolio constraints. Exploration is modeled through entropy-regularized relaxed controls, where the investor selects probability distributions over admissible portfolio allocations rather than deterministic strategies. Using dynamic programming arguments, we derive the associated entropy-regularized Hamilton-Jacobi-Bellman equation, whose Hamiltonian involves optimization over probability measures supported on a compact control set. We show that the optimal exploratory policy takes the form of a truncated Gaussian distribution characterized by spatial derivatives of the solution of the resulting nonlinear quasilinear parabolic partial differential equation. Under suitable structural conditions on the model coefficients, we prove the existence…
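To illustrate why a truncated Gaussian arises, the following is a minimal sketch and not the authors' implementation. It relies on a standard fact: maximizing expected reward plus lam times entropy over densities on a compact interval gives a Gibbs density proportional to exp(H(u)/lam). If H is assumed to be a concave quadratic in the control, H(u) = b*u - 0.5*a*u**2 with a > 0, that density is a Gaussian with mean b/a and variance lam/a, restricted to the interval. The names a, b, lam, lo, hi and the function truncated_gaussian_policy are hypothetical placeholders; in the paper the corresponding coefficients would depend on spatial derivatives of the PDE solution, which are not reproduced here.

```python
import math
import random
from statistics import NormalDist


def truncated_gaussian_policy(a, b, lam, lo, hi):
    """Build a sampler for the density proportional to exp(H(u)/lam) on [lo, hi].

    Assumption (hypothetical, for illustration): H(u) = b*u - 0.5*a*u**2, a > 0.
    Completing the square gives a Gaussian with mean b/a and variance lam/a,
    truncated to the compact control set [lo, hi].
    """
    if a <= 0 or lam <= 0 or lo >= hi:
        raise ValueError("need a > 0, lam > 0 and lo < hi")
    mean = b / a
    std = math.sqrt(lam / a)
    base = NormalDist(mean, std)
    # Probability mass of the untruncated Gaussian below each endpoint.
    p_lo, p_hi = base.cdf(lo), base.cdf(hi)

    def sample(rng=random):
        # Inverse-CDF sampling restricted to the mass inside [lo, hi].
        p = p_lo + (p_hi - p_lo) * rng.random()
        # inv_cdf requires 0 < p < 1, so keep p strictly inside that range.
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        # Final clamp absorbs floating-point rounding at the endpoints.
        return min(max(base.inv_cdf(p), lo), hi)

    return mean, std, sample


# Example: allocations constrained to [0, 1], untruncated mean 0.5.
mean, std, sample = truncated_gaussian_policy(a=4.0, b=2.0, lam=1.0, lo=0.0, hi=1.0)
draws = [sample(random.Random(i)) for i in range(5)]
```

A larger lam widens the distribution (more exploration), while lam close to zero concentrates the policy near the constrained maximizer of H. The sketch is not numerically robust when the interval lies far in a tail of the untruncated Gaussian.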
