Reinforcement learning for irreversible reinsurance problems: the randomized singular control approach

Zongxia Liang; Xiaodong Luo; Xiang Yu

arXiv:2512.02769·math.OC·December 3, 2025

TL;DR

This paper develops a novel reinforcement learning framework for continuous-time stochastic singular control problems, specifically applied to irreversible reinsurance, introducing randomization techniques and actor-critic algorithms to improve exploration and convergence.

Contribution

It introduces a new randomized singular control approach with entropy regularization and develops actor-critic algorithms for unknown model coefficients in irreversible reinsurance problems.

Findings

01

The proposed method achieves superior convergence in numerical experiments.

02

Randomization enhances exploration and learning performance.

03

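The time-inconsistency that arises in the outer problem of the two-stage formulation is addressed by characterizing time-consistent equilibrium singular controls when the model coefficients are known.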

Abstract

This paper studies continuous-time reinforcement learning for stochastic singular control, with an application to an infinite-horizon irreversible reinsurance problem. The singular control is equivalently characterized as a pair of regions in time and the augmented states, called the singular control law. To encourage exploration during learning, we propose a randomization method for singular control laws, new to the literature, based on an auxiliary singular control and entropy regularization. The exploratory singular control problem is formulated as a two-stage optimal control problem, in which a time-inconsistency issue arises in the outer problem. In the specific model setup with known model coefficients, we fully characterize the time-consistent equilibrium singular controls for the two-stage problem. Taking advantage of the solution…
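
As a rough illustration of why entropy regularization encourages exploration, the sketch below uses a toy one-step decision between acting and waiting. It is not the paper's formulation: the function names, the values `q_act` and `q_wait`, and the `temperature` weight are all hypothetical. It only shows the standard fact that adding a Shannon entropy bonus to a randomized binary choice makes the optimal probability a logistic function of the value gap rather than a hard 0 or 1.

```python
import math


def entropy(p: float) -> float:
    """Shannon entropy (in nats) of a Bernoulli(p) decision."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def regularized_value(p: float, q_act: float, q_wait: float, temperature: float) -> float:
    """Expected value of acting with probability p, plus an entropy bonus.

    q_act, q_wait and temperature are hypothetical inputs for this toy example.
    """
    return p * q_act + (1.0 - p) * q_wait + temperature * entropy(p)


def optimal_act_probability(q_act: float, q_wait: float, temperature: float) -> float:
    """Maximizer of regularized_value over p: a logistic function of the value gap."""
    return 1.0 / (1.0 + math.exp(-(q_act - q_wait) / temperature))
```

With a larger `temperature`, the optimal probability moves toward 0.5, so both choices keep being sampled; as `temperature` shrinks toward zero, it approaches the deterministic choice with the higher value.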

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Optimization and Variational Analysis