Exploratory mean-variance portfolio selection with Choquet regularizers
Junyi Guo, Xia Han, Hao Wang

TL;DR
This paper formulates a continuous-time exploratory mean-variance (EMV) portfolio selection problem in which Choquet regularizers measure the level of exploration, derives explicit solutions, and compares the resulting exploration strategies in reinforcement learning simulations.
Contribution
It develops an RL-based EMV model with Choquet regularizers, derives explicit solutions, and examines how the choice of regularizer shapes the exploration strategy.
Findings
Optimal distributions form a location-scale family influenced by Choquet regularizers
Explicit solutions for specific Choquet regularizers that generate exponential, uniform, and Gaussian exploratory samplers
RL simulations compare different Choquet regularizer-based exploration strategies
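
To make the location-scale finding concrete, here is a minimal Python sketch of the three samplers named above. It rests on an assumption that this summary does not state: that for a concave distortion function h with h(0) = h(1) = 0, the distribution maximizing the Choquet regularizer under a fixed mean mu and standard deviation sigma has quantile function Q(u) = mu + sigma * h'(1 - u) / ||h'||_2. The particular choices of h and all function names are illustrative, not the authors' code.

```python
import numpy as np
from statistics import NormalDist

# Assumed form (not stated in this summary): Q(u) = mu + sigma * h'(1 - u) / ||h'||_2.

def exponential_quantile(u, mu, sigma):
    # Assumed h(p) = -p log p, so h'(p) = -log p - 1 and ||h'||_2 = 1.
    # This is a shifted exponential with lower endpoint mu - sigma.
    return mu - sigma * (1.0 + np.log1p(-np.asarray(u, dtype=float)))

def uniform_quantile(u, mu, sigma):
    # Assumed h(p) = p - p^2, so h'(p) = 1 - 2p and ||h'||_2 = 1 / sqrt(3).
    # This is uniform on [mu - sqrt(3) sigma, mu + sqrt(3) sigma].
    return mu + sigma * np.sqrt(3.0) * (2.0 * np.asarray(u, dtype=float) - 1.0)

def gaussian_quantile(u, mu, sigma):
    # Assumed h'(p) = z(1 - p), with z the standard normal quantile function.
    z = np.vectorize(NormalDist().inv_cdf)
    return mu + sigma * z(np.asarray(u, dtype=float))

def sample(quantile_fn, mu, sigma, size, rng):
    # Inverse-transform sampling; clip to keep u strictly inside (0, 1).
    u = np.clip(rng.random(size), 1e-12, 1.0 - 1e-12)
    return quantile_fn(u, mu, sigma)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for name, q in [("exponential", exponential_quantile),
                    ("uniform", uniform_quantile),
                    ("gaussian", gaussian_quantile)]:
        x = sample(q, mu=1.0, sigma=2.0, size=100_000, rng=rng)
        print(name, float(x.mean()), float(x.std()))
```

All three samplers share the same mean and standard deviation and differ only in shape, which is what a location-scale family means here: the regularizer fixes the shape, while mu and sigma shift and stretch it.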
Abstract
In this paper, we study a continuous-time exploratory mean-variance (EMV) problem under the framework of reinforcement learning (RL), in which Choquet regularizers are used to measure the level of exploration. By applying the classical Bellman principle of optimality, the Hamilton-Jacobi-Bellman equation of the EMV problem is derived and solved explicitly by statically maximizing a mean-variance constrained Choquet regularizer. In particular, the optimal distributions form a location-scale family, whose shape depends on the choice of the Choquet regularizer. We further reformulate the continuous-time Choquet-regularized EMV problem using a variant of the Choquet regularizer. Several examples are given under specific Choquet regularizers that generate widely used exploratory samplers such as exponential, uniform and Gaussian. Finally, we design an RL algorithm to simulate and compare…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Auction Theory and Applications
