Computing the Performance of A New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards
James K. He, Sofía S. Villar, and Lida Mavrogonatou

TL;DR
This paper introduces a modified Gittins Index-based adaptive sampling algorithm tailored for exponential rewards, demonstrating improved earning and comparable learning in simulated multi-armed experiments, thus enhancing experimental efficiency.
Contribution
It presents the first adaptation of the Gittins Index for exponential rewards and evaluates its performance, showing benefits over traditional non-adaptive designs.
Findings
Better earning performance than non-adaptive designs
Comparable learning outcomes in simulated experiments
Potential for reducing experimental costs
Abstract
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational efficiency goals, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics comparable in learning…
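The learning-versus-earning trade-off described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method: it does not compute the Gittins Index, and instead swaps in Thompson sampling with a Gamma prior on each arm's exponential rate as a stand-in adaptive rule, compared against equal (non-adaptive) allocation. The function name `simulate`, the arm means, the Gamma(1, 1) prior, and the sample size are all assumptions chosen for illustration.

```python
import random


def simulate(means, n_samples, adaptive, seed=0):
    """Run one simulated experiment with exponential rewards.

    Returns (total_reward, pulls_per_arm). Larger rewards are treated as better.
    NOTE: the adaptive rule here is Thompson sampling, used as a stand-in;
    it is NOT the modified Gittins Index rule from the paper.
    """
    rng = random.Random(seed)
    k = len(means)
    # Assumed Gamma(shape=1, rate=1) prior on each arm's exponential rate.
    shape = [1.0] * k
    rate = [1.0] * k
    pulls = [0] * k
    total = 0.0
    for t in range(n_samples):
        if adaptive:
            # Draw one plausible rate per arm from its Gamma posterior.
            # random.gammavariate takes (shape, scale), so scale = 1 / rate.
            draws = [rng.gammavariate(shape[a], 1.0 / rate[a]) for a in range(k)]
            # Mean reward is 1 / rate, so the smallest drawn rate is the best arm.
            arm = min(range(k), key=lambda a: draws[a])
        else:
            # Non-adaptive design: rotate through arms for equal allocation.
            arm = t % k
        # random.expovariate takes the rate (1 / mean).
        reward = rng.expovariate(1.0 / means[arm])
        # Conjugate Gamma-exponential update: one more observation, sum of rewards.
        shape[arm] += 1.0
        rate[arm] += reward
        pulls[arm] += 1
        total += reward
    return total, pulls


def average_total(means, n_samples, adaptive, n_reps=100):
    """Average total reward over independent replicates (seeds 0..n_reps-1)."""
    totals = [simulate(means, n_samples, adaptive, seed=s)[0] for s in range(n_reps)]
    return sum(totals) / n_reps


if __name__ == "__main__":
    arm_means = [1.0, 3.0]  # hypothetical 2-armed experiment
    print("equal allocation:", average_total(arm_means, 200, adaptive=False))
    print("adaptive (Thompson):", average_total(arm_means, 200, adaptive=True))
```

Under these assumptions, the adaptive rule sends more samples to the arm with the larger mean, which is the "earning" side of the trade-off; the allocation counts returned in `pulls_per_arm` show how much data remains on the inferior arm for the "learning" side.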
Taxonomy
Topics: Advanced Bandit Algorithms Research
