TL;DR
This paper investigates the prophet inequality problem under noisy observations and unknown reward distributions, proposing algorithms that combine learning and decision-making to achieve near-optimal competitive ratios.
Contribution
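The paper introduces algorithms that integrate learning and decision-making through lower-confidence-bound (LCB) thresholding for noisy prophet inequalities, with a sharp competitive ratio in the i.i.d. setting and guarantees for non-identical distributions and limited window access.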
Findings
Explore-then-Decide and ε-Greedy strategies attain the sharp 1 - 1/e competitive ratio in the i.i.d. setting, under a mild condition on the optimal value.
A 1/2 competitive ratio is guaranteed for non-identical distributions against a relaxed benchmark.
Limited window access still guarantees a 1/2 competitive ratio against the optimal benchmark.
Abstract
We study the prophet inequality, a fundamental problem in online decision-making and optimal stopping, in a practical setting where rewards are observed only through noisy realizations and reward distributions are unknown. At each stage, the decision-maker receives a noisy reward whose true value follows a linear model with an unknown latent parameter, and observes a feature vector drawn from a distribution. To address this challenge, we propose algorithms that integrate learning and decision-making via lower-confidence-bound (LCB) thresholding. In the i.i.d. setting, we establish that both an Explore-then-Decide strategy and an ε-Greedy variant achieve the sharp competitive ratio of 1 - 1/e, under a mild condition on the optimal value. For non-identical distributions, we show that a competitive ratio of 1/2 can be guaranteed against a relaxed benchmark. Moreover, with…
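To make the idea of LCB thresholding concrete, here is a minimal sketch of an Explore-then-Decide loop under a linear reward model. It is illustrative only and not the paper's algorithm: the function name `explore_then_decide`, the ridge estimator, the confidence width `beta`, and the fixed `threshold` are all assumptions, since the abstract does not specify how the estimate, the confidence bound, or the threshold are constructed.

```python
import numpy as np


def explore_then_decide(X, y, n_explore, threshold, beta=1.0, ridge=1.0):
    """Hypothetical Explore-then-Decide rule with LCB thresholding.

    X: (n, d) feature vectors, one per stage.
    y: (n,) noisy rewards, assumed to follow y_t = x_t @ theta + noise.
    Returns the index of the accepted stage, or None if none is accepted.
    """
    n, d = X.shape

    # Exploration phase: observe the first n_explore stages without accepting,
    # then estimate the latent parameter by ridge regression (an assumption).
    X0, y0 = X[:n_explore], y[:n_explore]
    V = ridge * np.eye(d) + X0.T @ X0
    V_inv = np.linalg.inv(V)
    theta_hat = V_inv @ X0.T @ y0

    # Decision phase: accept the first stage whose lower confidence bound
    # on the true reward clears the threshold.
    for t in range(n_explore, n):
        x = X[t]
        lcb = x @ theta_hat - beta * np.sqrt(x @ V_inv @ x)
        if lcb >= threshold:
            return t
    return None


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 200, 3
    theta = np.array([0.5, -0.2, 0.8])  # latent parameter, unknown to the rule
    X = rng.normal(size=(n, d))
    y = X @ theta + 0.1 * rng.normal(size=n)
    print("accepted stage:", explore_then_decide(X, y, n_explore=50, threshold=1.0))
```

Using the lower rather than the upper confidence bound makes the rule conservative: a stage is accepted only when its reward is likely to exceed the threshold even after accounting for estimation error. The choice of `n_explore` and `threshold` that yields the stated competitive ratios is part of the paper's analysis and is not reproduced here.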