Reinforcement Learning for Dividend Optimization in Partially Observed Regime-Switching Diffusion Model
Zhongqin Gao, Yan Lv, Jingmin He

TL;DR
This paper develops a continuous-time reinforcement learning approach to optimizing bounded-rate dividend payouts in a regime-switching diffusion model where the market regime is unobserved and key parameters are unknown, using an exploratory (entropy-regularized) control framework and numerical algorithms.
Contribution
It introduces a continuous-time RL method with an entropy-regularized control framework for dividend optimization under partial information and regime switching.
Findings
The RL algorithm effectively learns optimal dividend policies.
Numerical experiments show strong out-of-sample performance.
The exploratory HJB system admits semi-analytical characterizations of the value function and the optimal exploratory dividend policy, determined by two functions that solve two ODEs.
Abstract
This paper studies the optimal dividend problem with a bounded payout rate in a partially observed regime-switching diffusion model, where, in practice, the market regime is unobserved and key model parameters are unknown. To address this partial-information setting, we propose a continuous-time reinforcement learning (RL) approach within an exploratory (entropy-regularized) stochastic control framework for discounted dividends under regime switching. The associated exploratory Hamilton-Jacobi-Bellman (HJB) system admits semi-analytical characterizations of the value function and the optimal exploratory dividend policy, determined by two unknown functions solving two ordinary differential equations (ODEs) together with positive real roots of the induced quadratic equations. Exploiting this structure, we introduce parametric families for both the value function and the policy, using…
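To make "exploratory (entropy-regularized)" control with a bounded payout rate concrete, here is a minimal Python sketch of a generic fact about such problems. It is not the paper's policy, whose closed form is not given in the abstract. If the payout rate a in [0, a_max] enters the Hamiltonian linearly with coefficient `slope` (in classical dividend problems this is 1 - V'(x)), then maximizing expected reward plus `lam` times the differential entropy over densities on [0, a_max] gives a Gibbs density proportional to exp(slope * a / lam), that is, a truncated exponential. The names `slope`, `lam`, `a_max` and all three functions are hypothetical and chosen for illustration only.

```python
import math
import random

# Illustrative assumption: the payout rate a in [0, a_max] appears in the
# Hamiltonian as slope * a, and exploration is rewarded by lam * entropy.
# The entropy-regularized maximizer is then pi(a) ~ exp(slope * a / lam).


def gibbs_density(a, slope, lam, a_max):
    """Density of the Gibbs policy on [0, a_max]; zero outside the interval."""
    if not 0.0 <= a <= a_max:
        return 0.0
    r = abs(slope) / lam
    if r * a_max < 1e-6:
        return 1.0 / a_max  # nearly flat Hamiltonian: uniform exploration
    # Distance from the favoured end point (a_max if slope > 0, else 0).
    d = a_max - a if slope > 0 else a
    return r * math.exp(-r * d) / (-math.expm1(-r * a_max))


def sample_payout(slope, lam, a_max, rng):
    """Draw one payout rate from the Gibbs policy by inverse-CDF sampling."""
    u = rng.random()
    r = abs(slope) / lam
    if r * a_max < 1e-6:
        return u * a_max
    d = min(-math.log1p(u * math.expm1(-r * a_max)) / r, a_max)
    return a_max - d if slope > 0 else d


def mean_payout(slope, lam, a_max):
    """Mean payout rate under the Gibbs policy (closed form)."""
    r = abs(slope) / lam
    if r * a_max < 1e-6:
        return 0.5 * a_max
    mean_d = 1.0 / r - a_max * math.exp(-r * a_max) / (-math.expm1(-r * a_max))
    return a_max - mean_d if slope > 0 else mean_d


if __name__ == "__main__":
    rng = random.Random(0)
    # As lam shrinks, the mean payout moves toward the bang-bang value a_max.
    for lam in (1.0, 0.1, 0.01):
        print(lam, mean_payout(0.5, lam, 2.0), sample_payout(0.5, lam, 2.0, rng))
```

As `lam` tends to zero the density concentrates at `a_max` when `slope` is positive and at 0 when it is negative, which recovers the familiar bang-bang structure of bounded-rate dividend problems without exploration. In the partially observed setting of the paper, the coefficient would presumably depend on both the surplus and the filtered regime estimate; that dependence is not modeled here.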
Taxonomy
Topics: Stochastic processes and financial applications · Stochastic dynamics and bifurcation · Probability and Risk Models
