# Exponential two-armed bandit problem

**Authors:** Alexander Kolnogorov, Denis Grunev

arXiv: 1908.05531 · 2019-08-16

## TL;DR

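This paper studies the two-armed bandit problem with exponentially distributed incomes in a Bayesian setting. It presents a recursive equation for the Bayesian strategy and Bayesian risk, and shows that as the control horizon goes to infinity the problem has the same limiting description as the Gaussian two-armed bandit. Because the Gaussian bandit describes batch processing, batch processing does not asymptotically increase the Bayesian risk.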

## Contribution

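The paper develops a Bayesian approach to the exponential two-armed bandit, presents a recursive equation for the Bayesian strategy and Bayesian risk, and derives a second-order partial differential equation for the limiting case in the domain of "close distributions". This limiting description coincides with that of the Gaussian two-armed bandit.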
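As background for the Bayesian approach, the following is the standard conjugate update for exponentially distributed observations. It is a textbook fact, not a formula taken from the paper: the rate parametrization $\lambda$, the Gamma prior, and the symbols $\alpha$, $\beta$, $n$, $S$ are assumptions made for illustration, and the paper may use a different parametrization or prior.

```latex
% Exponential likelihood with rate \lambda (mean income 1/\lambda):
p(x \mid \lambda) = \lambda e^{-\lambda x}, \qquad x \ge 0 .

% Assumed Gamma prior with shape \alpha and rate \beta:
\pi(\lambda) \propto \lambda^{\alpha - 1} e^{-\beta \lambda} .

% After n incomes x_1, \dots, x_n from one arm, with S = \sum_{i=1}^{n} x_i:
\pi(\lambda \mid x_1, \dots, x_n)
  \propto \lambda^{\alpha + n - 1} e^{-(\beta + S)\lambda},
\quad\text{i.e.}\quad
\lambda \mid x_1, \dots, x_n \sim \mathrm{Gamma}(\alpha + n,\ \beta + S) .
```

Under these assumptions the posterior for each arm depends on the data only through the pair $(n, S)$, which is the kind of low-dimensional state on which a recursive equation for the strategy and risk can be written.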

## Key findings

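- Exponential and Gaussian two-armed bandits have the same description in the limit as the control horizon goes to infinity.
- Batch processing does not increase the Bayesian risk relative to optimal one-by-one processing as the total number of processed data items goes to infinity.
- A recursive equation determines the Bayesian strategy and Bayesian risk for the exponential two-armed bandit.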

## Abstract

We consider the exponential two-armed bandit problem, in which incomes are described by exponential distribution densities. We develop a Bayesian approach and present a recursive equation for determining the Bayesian strategy and Bayesian risk. In the limiting case, as the control horizon goes to infinity, we obtain a second-order partial differential equation in the domain of "close distributions". The results are compared with the Gaussian two-armed bandit. It turns out that the exponential and Gaussian two-armed bandits have the same description in the limiting case. Since the Gaussian two-armed bandit describes batch processing, this means that for the exponential two-armed bandit, batch processing does not increase the Bayesian risk compared with optimal one-by-one processing as the total number of processed data items goes to infinity.
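To make the comparison between one-by-one and batch processing concrete, here is a minimal simulation sketch. It does not implement the paper's Bayesian strategy, which is defined by a recursive equation not reproduced in this summary. As a stand-in it uses Thompson sampling with an assumed Gamma prior on each arm's rate, and it updates the posterior only once per batch. The function name `run_bandit`, its parameters, and the chosen means are all hypothetical.

```python
import random


def run_bandit(means, horizon, batch_size, seed=0,
               prior_shape=1.0, prior_rate=1.0):
    """Simulate a two-armed bandit with exponential incomes.

    Stand-in strategy (an assumption, not the paper's method): Thompson
    sampling with a Gamma(shape, rate) posterior on each arm's rate.
    The arm is chosen once per batch and the posterior is updated only
    after the whole batch has been processed.

    Returns (expected_regret, pulls_per_arm).
    """
    rng = random.Random(seed)
    shape = [prior_shape, prior_shape]
    rate = [prior_rate, prior_rate]
    pulls = [0, 0]
    best_mean = max(means)
    regret = 0.0
    t = 0
    while t < horizon:
        # Draw a rate from each posterior; gammavariate takes (shape, scale),
        # so the scale is 1 / rate. A smaller rate means a larger mean income.
        sampled = [rng.gammavariate(shape[a], 1.0 / rate[a]) for a in range(2)]
        arm = 0 if sampled[0] < sampled[1] else 1

        # Process one batch on the chosen arm (the last batch may be shorter).
        n = min(batch_size, horizon - t)
        total = sum(rng.expovariate(1.0 / means[arm]) for _ in range(n))

        # Conjugate update: shape grows by the count, rate by the income sum.
        shape[arm] += n
        rate[arm] += total

        pulls[arm] += n
        regret += n * (best_mean - means[arm])
        t += n
    return regret, pulls


# Compare one-by-one processing (batch_size=1) with batches of 50.
for batch_size in (1, 50):
    runs = [run_bandit((1.0, 1.2), 2000, batch_size, seed=s)[0]
            for s in range(10)]
    print(batch_size, sum(runs) / len(runs))
```

The printed averages depend on the seeds and on the stand-in strategy, so they illustrate the experimental setup rather than reproduce the paper's asymptotic result.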

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.05531/full.md

## References

5 references — full list in the complete paper: https://tomesphere.com/paper/1908.05531/full.md

---
Source: https://tomesphere.com/paper/1908.05531