Irrevocable Multi-Armed Bandit Policies

Vivek Farias; Ritesh Madan

arXiv:0806.4133 · math.OC · June 26, 2008 · 4 cites

Open Access

TL;DR

This paper introduces an 'irrevocable' heuristic for multi-armed bandit problems with multiple simultaneous pulls, balancing limited exploration with near-optimal performance, especially in coin-based bandit scenarios.

Contribution

The paper proposes a novel irrevocable heuristic for multi-armed bandits, providing theoretical bounds and demonstrating practical effectiveness with minimal exploration costs.

Findings

01

In computational experiments, the irrevocable heuristic loses up to 5 to 10% relative to an upper bound on the performance of an optimal policy.

02

The heuristic's expected rewards are at least 1/8 of those of an optimal policy that is not restricted to be irrevocable.

03

The heuristic is robust across various problem parameters.

Abstract

This paper considers the multi-armed bandit problem with multiple simultaneous arm pulls. We develop a new 'irrevocable' heuristic for this problem. In particular, we do not allow recourse to arms that were pulled at some point in the past but then discarded. This irrevocable property is highly desirable from a practical perspective. As a consequence of this property, our heuristic entails a minimum amount of 'exploration'. At the same time, we find that the price of irrevocability is limited for a broad useful class of bandits we characterize precisely. This class includes one of the most common applications of the bandit model, namely, bandits whose arms are 'coins' of unknown biases. Computational experiments with a generative family of large scale problems within this class indicate losses of up to 5 to 10% relative to an upper bound on the performance of an optimal policy with no…
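To make the irrevocability constraint concrete, here is a minimal Python sketch of a policy that never returns to a discarded arm, run on 'coin' arms with several simultaneous pulls. It is not the paper's heuristic: the discard rule (posterior mean under a uniform Beta(1, 1) prior falling below a fixed `threshold`), the fixed order in which fresh arms are introduced, and the names `run_irrevocable`, `biases`, `k`, `horizon`, and `threshold` are all assumptions made for illustration.

```python
import random


def run_irrevocable(biases, k, horizon, threshold=0.4, seed=0):
    """Illustrative irrevocable policy (NOT the paper's heuristic).

    biases    -- true success probability of each coin (unknown to the policy)
    k         -- number of arms pulled simultaneously at each step
    horizon   -- number of time steps
    threshold -- assumed discard rule: drop an arm once its posterior mean
                 falls below this value
    """
    rng = random.Random(seed)
    n = len(biases)
    wins = [0] * n
    pulls = [0] * n
    untried = list(range(n))                      # arms never pulled so far
    active = [untried.pop(0) for _ in range(min(k, n))]
    discarded = set()
    history = []                                  # arms pulled at each step
    total = 0
    for _ in range(horizon):
        history.append(tuple(active))
        next_active = []
        for arm in active:
            reward = 1 if rng.random() < biases[arm] else 0
            wins[arm] += reward
            pulls[arm] += 1
            total += reward
            # Posterior mean of the bias under a uniform Beta(1, 1) prior.
            mean = (wins[arm] + 1) / (pulls[arm] + 2)
            if mean < threshold and untried:
                discarded.add(arm)                # irrevocable: never re-added
                next_active.append(untried.pop(0))
            else:
                next_active.append(arm)
        active = next_active
    return total, history, discarded
```

Calling `run_irrevocable([0.2, 0.8, 0.5, 0.9, 0.3], k=2, horizon=50)` returns the total reward, the per-step record of pulled arms, and the set of discarded arms. Because replacements are drawn only from arms that have never been pulled, each arm is played over one contiguous stretch of time, which is the property the abstract calls irrevocable.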

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Auction Theory and Applications