# Lexicographic Multiarmed Bandit

**Authors:** Alihan Hüyük, Cem Tekin

arXiv: 1907.11605 · 2019-07-30

## TL;DR

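This paper introduces algorithms for the multiobjective multiarmed bandit problem with lexicographically ordered objectives. The algorithms achieve expected regret that is uniformly bounded in time when the learner knows the lexicographic optimal (or near-lexicographic optimal) expected reward for each objective, and sublinear gap-free regret in the prior-free case. The paper also evaluates them experimentally on a variety of multiobjective learning problems.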

## Contribution

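It defines a multidimensional regret for bandits with lexicographically ordered objectives and proposes algorithms with uniformly bounded expected regret under two prior-knowledge settings, plus sublinear gap-free regret without prior knowledge. The algorithm for the second setting also attains bounded regret for the multiarmed bandit with satisficing objectives.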

## Key findings

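- When the learner knows the lexicographic optimal or near-lexicographic optimal expected reward for each objective, it achieves expected regret uniformly bounded in time.
- In the harder prior-free case, the learner still achieves gap-free regret that is sublinear in time.
- Experiments evaluate the proposed algorithms on a variety of multiobjective learning problems.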

## Abstract

We consider a multiobjective multiarmed bandit problem with lexicographically ordered objectives. In this problem, the goal of the learner is to select arms that are lexicographic optimal as much as possible without knowing the arm reward distributions beforehand. We capture this goal by defining a multidimensional form of regret that measures the loss of the learner due to not selecting lexicographic optimal arms, and then, consider two settings where the learner has prior information on the expected arm rewards. In the first setting, the learner only knows for each objective the lexicographic optimal expected reward. In the second setting, it only knows for each objective near-lexicographic optimal expected rewards. For both settings we prove that the learner achieves expected regret uniformly bounded in time. The algorithm we propose for the second setting also attains bounded regret for the multiarmed bandit with satisficing objectives. In addition, we also consider the harder prior-free case, and show that the learner can still achieve sublinear in time gap-free regret. Finally, we experimentally evaluate performance of the proposed algorithms in a variety of multiobjective learning problems.
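To make the notion of lexicographic optimality concrete, the following is a minimal sketch rather than the authors' algorithm. It assumes the expected rewards are known and stored as `mu[arm][objective]`, with objective 0 the most important. The names `lexicographic_optimal_arms`, `regret_vector`, and `tol` are hypothetical. The regret shown is one plausible reading of the multidimensional regret described in the abstract: for each objective, the summed gap between the lexicographic optimal expected reward and the expected reward of the pulled arm.

```python
from typing import List, Sequence


def lexicographic_optimal_arms(
    mu: Sequence[Sequence[float]], tol: float = 1e-12
) -> List[int]:
    """Return the arms that are lexicographic optimal for known means mu[arm][objective].

    Assumption: objectives are listed in decreasing order of priority.
    """
    candidates = list(range(len(mu)))
    num_objectives = len(mu[0])
    for i in range(num_objectives):
        # Keep only the arms that maximize objective i among the survivors.
        best = max(mu[a][i] for a in candidates)
        candidates = [a for a in candidates if mu[a][i] >= best - tol]
    return candidates


def regret_vector(mu: Sequence[Sequence[float]], pulls: Sequence[int]) -> List[float]:
    """Per-objective regret of a pull sequence (illustrative definition, see lead-in)."""
    # All lexicographic optimal arms share the same expected reward vector.
    mu_star = mu[lexicographic_optimal_arms(mu)[0]]
    return [sum(mu_star[i] - mu[a][i] for a in pulls) for i in range(len(mu_star))]


# Hypothetical example: 3 arms, 2 objectives.
mu = [
    [0.9, 0.2],  # ties on objective 0, loses on objective 1
    [0.9, 0.7],  # lexicographic optimal
    [0.5, 1.0],  # best on objective 1 but suboptimal on objective 0
]
optimal = lexicographic_optimal_arms(mu)
regret = regret_vector(mu, pulls=[0, 1, 2])
```

In this example only arm 1 survives both filtering steps. Under this illustrative definition, a single pull can contribute negative regret in a lower-priority objective: arm 2 beats the optimal arm on objective 1 while being suboptimal on objective 0.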

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.11605/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1907.11605/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1907.11605/full.md

---
Source: https://tomesphere.com/paper/1907.11605