On optimal ordering in the optimal stopping problem
Shipra Agrawal, Jay Sethuraman, Xingyu Zhang

TL;DR
This paper explores how choosing the order of observing random variables in an optimal stopping problem can significantly improve expected rewards, providing algorithms and complexity results for specific cases.
Contribution
It analyzes the relatively less studied question of order selection in optimal stopping, proving a novel prophet inequality for random variables with support size at most 2 and giving algorithms and complexity results for small support sizes.
Findings
When each random variable has support of size at most 2, the optimal ordering achieves an expected reward within a factor of 1.25 of the expected hindsight maximum.
A simple $O(n^2)$ algorithm finds the optimal order for support size 2.
The problem becomes NP-hard with support size 3, but admits an FPTAS.
Abstract
In the classical optimal stopping problem, a player is given a sequence of random variables $X_1, \ldots, X_n$ with known distributions. After observing the realization of $X_i$, the player can either accept the observed reward from $X_i$ and stop, or reject the observed reward from $X_i$ and continue to observe the next variable $X_{i+1}$ in the sequence. Under any fixed ordering of the random variables, an optimal stopping policy, one that maximizes the player's expected reward, is given by the solution of a simple dynamic program. In this paper, we investigate the relatively less studied question of selecting the order in which the random variables should be observed so as to maximize the expected reward at the stopping time. To demonstrate the benefits of order selection, we prove a novel prophet inequality showing that, when the support of each random variable has size at most 2, the…
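The dynamic program mentioned in the abstract can be sketched as backward induction over a fixed order. The sketch below is illustrative only and is not the paper's algorithm: it assumes independent, nonnegative, finitely supported rewards, a payoff of 0 if every variable is rejected, and a representation of each variable as a list of `(value, probability)` pairs. The names `expected_reward` and `best_order_brute_force` are hypothetical, and the brute-force search over permutations stands in for the paper's ordering algorithms purely to show that the order matters.

```python
from itertools import permutations


def expected_reward(order):
    """Expected reward of the optimal stopping policy for a fixed order.

    `order` is a sequence of distributions, each a list of
    (value, probability) pairs. Assumption: rejecting everything pays 0.
    """
    continuation = 0.0  # value of continuing past the last variable
    # Backward induction: accept a realization iff it beats the
    # expected value of continuing with the remaining variables.
    for dist in reversed(order):
        continuation = sum(p * max(v, continuation) for v, p in dist)
    return continuation


def best_order_brute_force(variables):
    """Try every ordering (exponential time; for tiny examples only)."""
    return max(permutations(variables), key=expected_reward)


# Two variables with support size at most 2 (illustrative numbers).
X = [(1.0, 1.0)]              # always pays 1
Y = [(0.0, 0.5), (2.0, 0.5)]  # pays 0 or 2 with equal probability

reward_xy = expected_reward([X, Y])  # observe X first, then Y
reward_yx = expected_reward([Y, X])  # observe Y first, then X
best = best_order_brute_force([X, Y])
```

In this toy instance, observing `Y` first yields an expected reward of 1.5, which matches the expected hindsight maximum, while observing `X` first yields only 1.0.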
Taxonomy
Topics: Optimization and Search Problems · Auction Theory and Applications · Reinforcement Learning in Robotics
