# Path Planning Problems with Side Observations: When Colonels Play Hide-and-Seek

**Authors:** Dong Quan Vu, Patrick Loiseau, Alonso Silva, Long Tran-Thanh

arXiv: 1905.11151 · 2019-12-05

## TL;DR

This paper casts online resource allocation games such as Colonel Blotto and Hide-and-Seek as path planning problems with side observations, proposes EXP3-OE, an efficient algorithm with expected-regret guarantees, and illustrates its benefit on both games.

## Contribution

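The paper introduces EXP3-OE, the first algorithm for path planning problems with side observations (SOPPP) that has a guaranteed efficient running time without requiring any auxiliary oracle. It comes with proven expected-regret bounds and applies to resource allocation games with extremely large strategy spaces.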

## Key findings

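- EXP3-OE achieves an expected-regret bound in SOPPP that matches the order of the best benchmark in the literature.
- The algorithm has a guaranteed efficient running time and requires no auxiliary oracle.
- Under additional assumptions on the observability model, the regret bounds of EXP3-OE can be improved further.
- Applying EXP3-OE to the online Colonel Blotto (CB) and Hide-and-Seek (HS) games illustrates its benefit in SOPPP.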
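
To make the notion of regret in the findings above concrete, the sketch below computes the realized regret of a sequence of chosen paths against the best fixed path in hindsight, which is the usual benchmark in adversarial online learning. The function name `regret`, the edge labels, and the toy losses are illustrative assumptions, not taken from the paper.

```python
def path_loss(path, losses):
    """Loss of a path: the sum of the losses on its edges."""
    return sum(losses[edge] for edge in path)


def regret(chosen_paths, loss_history, candidate_paths):
    """Learner's cumulative loss minus that of the best fixed path in hindsight."""
    learner_total = sum(
        path_loss(path, losses)
        for path, losses in zip(chosen_paths, loss_history)
    )
    best_fixed_total = min(
        sum(path_loss(path, losses) for losses in loss_history)
        for path in candidate_paths
    )
    return learner_total - best_fixed_total


# Hypothetical toy instance: two single-edge paths and three rounds of losses.
P, Q = ("a",), ("b",)
loss_history = [{"a": 1, "b": 0}, {"a": 0, "b": 1}, {"a": 1, "b": 0}]
chosen = [P, Q, P]  # the learner happens to pick the worse path every round
toy_regret = regret(chosen, loss_history, [P, Q])
```

Here the learner accumulates a loss of 3 while always playing `Q` would have cost 1, so `toy_regret` is 2. A regret bound limits how fast this gap can grow with the number of rounds.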

## Abstract

Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path planning problems with side-observations (SOPPP): at each stage, a learner chooses a path on a directed acyclic graph and suffers the sum of losses that are adversarially assigned to the corresponding edges; and she then receives semi-bandit feedback with side-observations (i.e., she observes the losses on the chosen edges plus some others). We propose a novel algorithm, EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP without requiring any auxiliary oracle. We provide an expected-regret bound of EXP3-OE in SOPPP matching the order of the best benchmark in the literature. Moreover, we introduce additional assumptions on the observability model under which we can further improve the regret bounds of EXP3-OE. We illustrate the benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games.
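
One round of the SOPPP interaction described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the four-edge graph, the `REVEALS` side-observation map, and the uniformly random path choice are all hypothetical stand-ins, and the code shows only the feedback protocol, not EXP3-OE itself.

```python
import random

# Hypothetical toy DAG: edge name -> (tail, head). Not taken from the paper.
EDGES = {
    "e1": ("s", "a"),
    "e2": ("s", "b"),
    "e3": ("a", "t"),
    "e4": ("b", "t"),
}

# Hypothetical side-observation model: choosing an edge also reveals these edges.
REVEALS = {
    "e1": {"e2"},
    "e2": set(),
    "e3": {"e4"},
    "e4": set(),
}


def all_paths(source, sink):
    """Enumerate source-to-sink paths as tuples of edge names (toy graphs only)."""
    if source == sink:
        return [()]
    paths = []
    for name, (tail, head) in EDGES.items():
        if tail == source:
            paths.extend((name,) + rest for rest in all_paths(head, sink))
    return paths


def play_round(path, losses):
    """Return the loss suffered on `path` and the edge losses revealed to the learner."""
    suffered = sum(losses[e] for e in path)
    observed = set(path)  # semi-bandit feedback: every chosen edge is observed
    for e in path:
        observed |= REVEALS[e]  # side observations triggered by the chosen edges
    return suffered, {e: losses[e] for e in observed}


rng = random.Random(0)
paths = all_paths("s", "t")
losses = {e: rng.random() for e in EDGES}  # stand-in for adversarially chosen losses
path = rng.choice(paths)                   # placeholder policy, not EXP3-OE
suffered, feedback = play_round(path, losses)
```

Enumerating every path is only feasible on a toy graph. The strategy spaces of the CB and HS games are extremely large, which is why the abstract stresses an efficient running time without an auxiliary oracle.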

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.11151/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1905.11151/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1905.11151/full.md

---
Source: https://tomesphere.com/paper/1905.11151