Exploratory Optimal Stopping: A Singular Control Formulation

Jodi Dianetti; Giorgio Ferrari; Renyuan Xu

arXiv:2408.09335·math.OC·March 12, 2026

TL;DR

This paper recasts optimal stopping as a singular control problem with an entropy-based exploration incentive, proposes RL algorithms to solve it, and provides performance guarantees for those algorithms.

Contribution

It presents a novel singular control formulation for exploratory optimal stopping and develops scalable RL algorithms with theoretical guarantees.

Findings

01

The regularized stopping problem encourages exploration.

02

A unique optimal strategy is identified via dynamic programming.

03

The model-free RL method scales with neural networks.

Abstract

This paper explores continuous-time and state-space optimal stopping problems from a reinforcement learning perspective. We begin by formulating the stopping problem using randomized stopping times, where the decision maker's control is represented by the probability of stopping within a given time; specifically, a bounded, non-decreasing, càdlàg control process. To encourage exploration and facilitate learning, we introduce a regularized version of the problem by penalizing the performance criterion with the cumulative residual entropy of the randomized stopping time. The regularized problem takes the form of an (n+1)-dimensional degenerate singular stochastic control problem with finite fuel, where the regularized free boundary becomes the graph of a function mapping the state variable of the original stopping problem into the probability of stopping. We address this singular control…
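
To make the entropy penalty concrete, here is a minimal sketch. It assumes the regularizer is the standard cumulative residual entropy, -∫ (1 - ξ_t) log(1 - ξ_t) dt, where ξ_t is the probability of having stopped by time t. The function name, the time grid, and the left Riemann discretization are illustrative assumptions and are not taken from the paper.

```python
import math


def cumulative_residual_entropy(times, stop_prob):
    """Approximate -∫ (1 - xi_t) * log(1 - xi_t) dt with a left Riemann sum.

    times:     increasing time grid t_0 < t_1 < ... < t_N
    stop_prob: xi_{t_k} = P(tau <= t_k), a non-decreasing path in [0, 1]
    (Hypothetical helper for illustration only.)
    """
    if len(times) != len(stop_prob):
        raise ValueError("times and stop_prob must have the same length")
    total = 0.0
    for k in range(len(times) - 1):
        xi = stop_prob[k]
        if not 0.0 <= xi <= 1.0:
            raise ValueError("stopping probabilities must lie in [0, 1]")
        if k > 0 and xi < stop_prob[k - 1]:
            raise ValueError("stopping probabilities must be non-decreasing")
        survival = 1.0 - xi  # probability of not having stopped yet
        if survival > 0.0:   # convention: 0 * log(0) = 0
            total -= survival * math.log(survival) * (times[k + 1] - times[k])
    return total


# A deterministic stopping rule (xi jumps from 0 to 1) carries no entropy,
# while a genuinely randomized rule is rewarded with a positive value.
deterministic = cumulative_residual_entropy([0.0, 1.0, 2.0], [0.0, 0.0, 1.0])
randomized = cumulative_residual_entropy([0.0, 1.0, 2.0], [0.5, 0.5, 0.5])
```

Under this assumed form, the penalty vanishes for strict (non-randomized) stopping times and is positive whenever the survival probability lies strictly between 0 and 1, which is how the regularization rewards exploration.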

Taxonomy

Topics: Advanced Control Systems Optimization · Optimization and Search Problems

Methods: Entropy Regularization