A PAC algorithm in relative precision for bandit problem with costly sampling
Marie Billaud-Friess, Arthur Macherey, Anthony Nouy, and Clémentine Prieur

TL;DR
This paper introduces PAC algorithms for finite-armed bandit problems that achieve relative precision with high probability, focusing on reducing sampling costs in applications where sampling is expensive.
Contribution
The paper presents a naive and an adaptive PAC algorithm for discrete optimization; the adaptive method reduces sample complexity and is suited to costly sampling scenarios.
Findings
Adaptive algorithm outperforms naive in sample efficiency
Both algorithms achieve relative precision with high probability
Adaptive method is especially effective for high-cost sampling applications
Abstract
This paper considers the problem of maximizing an expectation function over a finite set, also known as a finite-armed bandit problem. We first propose a naive stochastic bandit algorithm for obtaining a probably approximately correct (PAC) solution to this discrete optimization problem in relative precision, that is, a solution that solves the optimization problem up to a relative error smaller than a prescribed tolerance, with high probability. We also propose an adaptive stochastic bandit algorithm which provides a PAC solution with the same guarantees. The adaptive algorithm outperforms the naive algorithm in mean complexity, measured by the number of generated samples, and is particularly well suited for applications with high sampling cost.
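To make the idea of a PAC solution in relative precision concrete, here is a minimal Python sketch. It is not the authors' algorithm: it swaps in a textbook successive-elimination scheme with Hoeffding confidence intervals, and it assumes rewards bounded in [0, 1] and a strictly positive best mean. The function name `pac_relative`, the parameters `tau` (relative tolerance) and `lam` (allowed failure probability), the `max_rounds` cap, and the Bernoulli test arms are all hypothetical choices made for illustration. The loop stops once the lower confidence bound of the leading arm is at least (1 - tau) times the largest upper confidence bound, which certifies a relative error of at most tau on the event that all intervals hold. With `adaptive=False` every arm keeps being sampled (a naive baseline); with `adaptive=True` arms that can no longer be optimal are dropped, which saves samples.

```python
import math
import random


def pac_relative(arms, tau, lam, adaptive=True, max_rounds=100_000):
    """Return (arm_index, total_samples, certified).

    arms: list of zero-argument callables returning a reward in [0, 1]
          (boundedness is an assumption of this sketch).
    tau:  relative tolerance in (0, 1); lam: allowed failure probability.
    """
    k = len(arms)
    sums = [0.0] * k
    active = list(range(k))
    total = 0
    for n in range(1, max_rounds + 1):
        # Draw one new sample from every arm that is still active,
        # so each active arm has exactly n samples after round n.
        for i in active:
            sums[i] += arms[i]()
            total += 1
        # Hoeffding radius with a union bound over arms and rounds.
        radius = math.sqrt(math.log(4 * k * n * n / lam) / (2 * n))
        lower = {i: sums[i] / n - radius for i in active}
        upper = {i: sums[i] / n + radius for i in active}
        best = max(active, key=lambda i: lower[i])
        # Relative-precision stopping rule.
        if lower[best] >= (1 - tau) * max(upper.values()):
            return best, total, True
        if adaptive:
            # Drop arms whose upper bound falls below the best lower bound.
            active = [i for i in active if upper[i] >= lower[best]]
    return best, total, False  # sample budget exhausted, no certificate


def bernoulli(p, rng):
    """Hypothetical test arm: reward 1 with probability p, else 0."""
    return lambda: 1.0 if rng.random() < p else 0.0


means = [0.9, 0.5, 0.3, 0.2]
rng = random.Random(0)
arms = [bernoulli(p, rng) for p in means]
arm_a, n_adaptive, ok_a = pac_relative(arms, tau=0.1, lam=0.05, adaptive=True)
arm_n, n_naive, ok_n = pac_relative(arms, tau=0.1, lam=0.05, adaptive=False)
print(arm_a, n_adaptive, arm_n, n_naive)
```

In this toy setting the adaptive variant stops spending samples on clearly suboptimal arms early on, so its total sample count should be well below that of the naive variant. The Hoeffding radius used here is conservative; tighter concentration bounds would reduce both counts.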
Taxonomy
Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Stochastic Gradient Optimization Techniques
