A Reinforcement Learning Framework for Some Singular Stochastic Control Problems

Zongxia Liang; Xiaodong Luo; Xiang Yu

arXiv:2506.22203·math.OC·May 14, 2026

TL;DR

This paper introduces a continuous-time reinforcement learning framework for a class of singular stochastic control problems. It characterizes the optimal singular control as a pair of regions in time and augmented-state space, and devises q-learning algorithms to identify those regions.

Contribution

The paper generalizes existing policy evaluation theory from regular controls to singular controls, introduces zero-order and first-order q-functions for model-free learning, and develops algorithms grounded in these theoretical results.

Findings

01 Developed a policy improvement theorem via region iteration.

02 Established a martingale characterization for the pair of q-functions together with the value function.

03 Presented numerical experiments demonstrating the proposed algorithms.

Abstract

We develop a continuous-time reinforcement learning framework for a class of singular stochastic control problems without entropy regularization. The optimal singular control is characterized as the optimal singular control law, which is a pair of regions of time and the augmented states. The goal of learning is to identify such an optimal region via the trial-and-error procedure. In this context, we generalize the existing policy evaluation theories with regular controls to learn our optimal singular control law and develop a policy improvement theorem via the region iteration. To facilitate the model-free policy iteration procedure, we further introduce the zero-order and first-order q-functions arising from singular control problems and establish the martingale characterization for the pair of q-functions together with the value function. Based on our theoretical findings, some…
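To make the idea of a control law given by a region more concrete, here is a minimal sketch on a toy one-dimensional problem. It is not the paper's algorithm: the dynamics, the reward (discounted cumulative control until ruin), the time-independent threshold `b`, and the function names `simulate_threshold_policy` and `best_threshold` are all assumptions chosen for illustration. The sketch also uses known model parameters and a brute-force comparison of candidate thresholds, where the paper learns the region in a model-free way via q-functions.

```python
import math
import random


def simulate_threshold_policy(b, x0=1.0, mu=0.5, sigma=1.0, rho=0.1,
                              horizon=5.0, dt=0.01, n_paths=200, seed=0):
    """Monte Carlo value estimate of a threshold-type singular control law.

    Hypothetical toy model: dX = mu dt + sigma dW - d(xi). The control xi acts
    only while X lies in the action region {x > b}, pushing X back to b.
    The reward is the discounted amount of control exerted; an episode stops
    at ruin (X <= 0) or at `horizon`.
    """
    rng = random.Random(seed)
    n_steps = int(round(horizon / dt))
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        payoff = 0.0
        # Initial jump at time 0 if the start point is inside the action region.
        if x > b:
            payoff += x - b
            x = b
        for k in range(1, n_steps + 1):
            # Euler step of the uncontrolled diffusion.
            x += mu * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
            if x <= 0.0:
                break  # ruin: the episode ends
            if x > b:
                # Singular control: push the state back to the boundary b.
                payoff += math.exp(-rho * k * dt) * (x - b)
                x = b
        total += payoff
    return total / n_paths


def best_threshold(candidates, **kwargs):
    """Brute-force stand-in for policy improvement: compare candidate regions."""
    values = {b: simulate_threshold_policy(b, **kwargs) for b in candidates}
    return max(values, key=values.get), values


if __name__ == "__main__":
    b_star, values = best_threshold([0.5, 1.0, 1.5, 2.0])
    print("estimated best threshold:", b_star)
    print("estimated values:", values)
```

Each candidate threshold defines one action region, and the Monte Carlo average plays the role of policy evaluation for that region. A fixed seed is reused across candidates so that the comparison is made on common random numbers.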
