# Introduction to Multi-Armed Bandits

**Authors:** Aleksandrs Slivkins

arXiv: 1904.07272 · 2024-04-05

## TL;DR

This book offers an accessible, comprehensive introduction to multi-armed bandits, covering theoretical foundations, various reward models, and applications in economics, with self-contained chapters and exercises for learners.

## Contribution

The book provides a structured, textbook-like overview of multi-armed bandits, including recent extensions and connections to economics, suitable for educational purposes.

## Key findings

- Covers IID and adversarial reward models comprehensively
- Includes chapters on contextual bandits and economic applications
- Provides exercises and standalone surveys on advanced topics

## Abstract

Multi-armed bandits are a simple but very powerful framework for algorithms that make decisions over time under uncertainty. An enormous body of work has accumulated over the years, covered in several books and surveys. This book provides a more introductory, textbook-like treatment of the subject. Each chapter tackles a particular line of work, providing a self-contained, teachable technical introduction and a brief review of the further developments; many of the chapters conclude with exercises.

The book is structured as follows. The first four chapters are on IID rewards, from the basic model to impossibility results to Bayesian priors to Lipschitz rewards. The next three chapters cover adversarial rewards, from the full-feedback version to adversarial bandits to extensions with linear rewards and combinatorially structured actions. Chapter 8 is on contextual bandits, a middle ground between IID and adversarial bandits in which the change in reward distributions is completely explained by observable contexts. The last three chapters cover connections to economics, from learning in repeated games to bandits with supply/budget constraints to exploration in the presence of incentives. The appendix provides sufficient background on concentration and KL-divergence.

The chapters on "bandits with similarity information", "bandits with knapsacks" and "bandits and agents" can also be consumed as standalone surveys on the respective topics.
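To make the basic IID-rewards setting concrete, the following is a minimal sketch of one well-known algorithm for it, UCB1, run on simulated Bernoulli arms. It is an illustration only and is not taken from the book's text: the function name `ucb1`, the arm means, the horizon, and the seed are all assumptions chosen for the example.

```python
import math
import random


def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms with the given (hypothetical) mean rewards.

    Assumes horizon >= len(means), so every arm is pulled at least once.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of times each arm has been pulled
    totals = [0.0] * k    # sum of rewards observed for each arm

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull each arm once to initialise the estimates
        else:
            # Index = empirical mean + confidence radius sqrt(2 ln t / n).
            arm = max(
                range(k),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        # Simulated IID Bernoulli reward for the chosen arm.
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward

    return counts, sum(totals)


# Illustrative instance: three arms, the last one is best.
counts, total_reward = ucb1(means=[0.1, 0.5, 0.9], horizon=5000)
print(counts, total_reward)
```

In a run like this, most pulls should concentrate on the arm with the highest mean, while the confidence radius keeps the other arms from being abandoned too early.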

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.07272/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1904.07272/full.md

---
Source: https://tomesphere.com/paper/1904.07272