# Stacked Thompson Bandits

**Authors:** Lenz Belzner, Thomas Gabor

arXiv: 1702.08726 · 2017-03-01

## TL;DR

Stacked Thompson Bandits (STB) is a Bayesian method that efficiently generates plans satisfying bounded temporal logic requirements by stacking multi-armed bandits and using Thompson sampling to guide search.

## Contribution

The paper introduces STB, which evaluates candidate plans in a simulation and feeds the results back into a stack of multi-armed bandits, each using Thompson sampling for action selection, to generate plans under bounded temporal logic requirements.

## Key findings

- STB generates plans that satisfy the given requirement with high probability.
- It does so while searching only a fraction of the search space.
- The method outperforms baseline approaches.

## Abstract

We introduce Stacked Thompson Bandits (STB) for efficiently generating plans that are likely to satisfy a given bounded temporal logic requirement. STB uses a simulation for evaluation of plans, and takes a Bayesian approach to using the resulting information to guide its search. In particular, we show that stacking multiarmed bandits and using Thompson sampling to guide the action selection process for each bandit enables STB to generate plans that satisfy requirements with a high probability while only searching a fraction of the search space.
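The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one Beta-Bernoulli bandit per plan step, a binary simulation outcome (requirement satisfied or not), and a uniform Beta(1, 1) prior. The names `stacked_thompson_search` and `toy_simulate`, and the toy requirement itself, are hypothetical and chosen only for illustration.

```python
import random


def stacked_thompson_search(simulate, n_steps, n_actions, n_iters, rng):
    """Sketch of a bandit stack: one Thompson-sampling bandit per plan step.

    `simulate(plan)` is an assumed callback returning True when the
    simulated plan satisfies the requirement.
    """
    # Beta(1, 1) prior for every (step, action) pair, stored as pseudo-counts.
    wins = [[1] * n_actions for _ in range(n_steps)]
    losses = [[1] * n_actions for _ in range(n_steps)]

    for _ in range(n_iters):
        # Build a plan: each bandit draws one sample per action from its
        # posterior and picks the action with the largest sample.
        plan = []
        for t in range(n_steps):
            samples = [
                rng.betavariate(wins[t][a], losses[t][a])
                for a in range(n_actions)
            ]
            plan.append(max(range(n_actions), key=samples.__getitem__))

        # Evaluate the whole plan once, then update every bandit in the
        # stack with the same binary outcome.
        satisfied = simulate(plan)
        for t, a in enumerate(plan):
            if satisfied:
                wins[t][a] += 1
            else:
                losses[t][a] += 1

    # Report the plan that is greedy with respect to the posterior means.
    best_plan = [
        max(
            range(n_actions),
            key=lambda a: wins[t][a] / (wins[t][a] + losses[t][a]),
        )
        for t in range(n_steps)
    ]
    return best_plan, wins, losses


def toy_simulate(plan):
    # Hypothetical requirement for illustration: the plan never uses action 0.
    return all(a != 0 for a in plan)


rng = random.Random(0)
plan, wins, losses = stacked_thompson_search(
    toy_simulate, n_steps=3, n_actions=2, n_iters=200, rng=rng
)
print(plan)
```

Sharing a single simulation outcome across all bandits in the stack is a simplification made here; how the paper assigns credit to individual steps is described in the full text.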

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.08726/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1702.08726/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1702.08726/full.md

---
Source: https://tomesphere.com/paper/1702.08726