Adaptive Policies for Sequential Sampling under Incomplete Information and a Cost Constraint

Apostolos Burnetas; Odysseas Kanavetas

arXiv:1201.4002 · stat.ML · January 20, 2012

TL;DR

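This paper develops adaptive policies for sequential sampling from several populations with unknown outcome distributions. The goal is to maximize the long-run average outcome while keeping the expected average sampling cost within a bound, and the policies ensure that the average outcome converges to the value that would be optimal if the distributions were known.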

Contribution

It introduces a class of consistent adaptive policies under which the average outcome converges with probability 1 to the optimal value attainable under complete information, for all outcome distributions with finite means.

Findings

01

Under the proposed policies, the average outcome converges almost surely to the optimal value attainable under complete information.

02

Simulations are used to compare the rate of convergence of several policies in this class.

03

The policies must learn the unknown means while sampling near-optimally, subject to the bound on the expected average sampling cost.

Abstract

We consider the problem of sequential sampling from a finite number of independent statistical populations to maximize the expected infinite horizon average outcome per period, under a constraint that the expected average sampling cost does not exceed an upper bound. The outcome distributions are not known. We construct a class of consistent adaptive policies, under which the average outcome converges with probability 1 to the true value under complete information for all distributions with finite means. We also compare the rate of convergence for various policies in this class using simulation.
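
To make the setting in the abstract concrete, here is a minimal sketch in Python. It assumes that the complete-information benchmark can be written as a linear program over randomization weights x: maximize sum(x_i * mu_i) subject to sum(x_i * c_i) <= C0, sum(x_i) = 1, x >= 0. The adaptive rule shown (sample each population once, force a sample of the least-sampled population at perfect-square periods, otherwise randomize according to the benchmark solved with estimated means) is an illustrative assumption and not necessarily one of the authors' policies. The names `solve_benchmark` and `simulate`, the Gaussian outcomes with unit variance, the known costs, and all numeric values are hypothetical.

```python
import math
import random
from itertools import combinations


def solve_benchmark(means, costs, budget):
    """Maximize sum(x*mu) s.t. sum(x*c) <= budget, sum(x) = 1, x >= 0.

    An optimal vertex of this LP puts weight on at most two populations,
    so it is enough to enumerate single populations and budget-tight pairs.
    Returns (optimal value, list of weights).
    """
    k = len(means)
    best_value, best_x = float("-inf"), None
    # Vertices with one population: feasible if its cost fits the budget.
    for i in range(k):
        if costs[i] <= budget and means[i] > best_value:
            best_value = means[i]
            best_x = [1.0 if j == i else 0.0 for j in range(k)]
    # Vertices with two populations: the mix spends the budget exactly.
    for i, j in combinations(range(k), 2):
        lo, hi = (i, j) if costs[i] < costs[j] else (j, i)
        if costs[lo] < budget < costs[hi]:
            w_hi = (budget - costs[lo]) / (costs[hi] - costs[lo])
            value = (1 - w_hi) * means[lo] + w_hi * means[hi]
            if value > best_value:
                best_value = value
                best_x = [0.0] * k
                best_x[lo], best_x[hi] = 1 - w_hi, w_hi
    if best_x is None:
        raise ValueError("no population is affordable within the budget")
    return best_value, best_x


def simulate(means, costs, budget, horizon, seed=0):
    """Run the illustrative adaptive rule; return (avg outcome, avg cost)."""
    rng = random.Random(seed)
    k = len(means)
    counts, sums = [0] * k, [0.0] * k
    total_outcome = total_cost = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initial round: sample every population once
        elif math.isqrt(t) ** 2 == t:
            arm = counts.index(min(counts))  # sparse forced sampling
        else:
            # Plug the current sample means into the benchmark LP and
            # randomize over populations with the resulting weights.
            estimates = [sums[i] / counts[i] for i in range(k)]
            _, weights = solve_benchmark(estimates, costs, budget)
            arm = rng.choices(range(k), weights=weights)[0]
        outcome = rng.gauss(means[arm], 1.0)  # assumed outcome model
        counts[arm] += 1
        sums[arm] += outcome
        total_outcome += outcome
        total_cost += costs[arm]
    return total_outcome / horizon, total_cost / horizon


if __name__ == "__main__":
    mu, c, c0 = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0], 2.5
    print(solve_benchmark(mu, c, c0))
    print(simulate(mu, c, c0, horizon=20000))
```

In this hypothetical instance the benchmark mixes the second and third populations with weights 0.75 and 0.25, which spends the budget exactly and yields a value of 2.25. Because forced samples occur only at perfect-square periods, their share of all periods shrinks as the horizon grows, so they have a vanishing effect on the long-run average outcome and cost in this sketch.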

Taxonomy

Topics: Healthcare Operations and Scheduling Optimization · Advanced Causal Inference Techniques · Statistical Methods and Bayesian Inference