Hedging Algorithms and Repeated Matrix Games
Bruno Bouzy, Marc Métivier, Damien Pellier

TL;DR
This paper demonstrates that well-designed hedging algorithms, especially two-level ones combining $S$, $UCB$, and $M3$, outperform previous multi-agent learning algorithms in repeated matrix games.
Contribution
It introduces and experimentally validates the effectiveness of multi-level hedging algorithms for repeated matrix games, showing their superiority over existing methods.
Findings
Two-level hedging algorithms outperform single-level ones.
$S$ is an effective top-level algorithm.
$UCB$ and $M3$ are strong basic algorithms.
Abstract
Playing repeated matrix games (RMG) while maximizing the cumulative returns is a basic method to evaluate multi-agent learning (MAL) algorithms. Previous work has shown that several algorithms behave well on average in RMG. Besides, hedging algorithms have been shown to be effective on prediction problems. A hedging algorithm is made up of a top-level algorithm and a set of basic algorithms. To make its decision, a hedging algorithm uses its top-level algorithm to choose a basic algorithm, and the chosen algorithm makes the decision. This paper experimentally shows that well-selected hedging algorithms are better on average than all previous MAL algorithms on the task of playing RMG against various players. $S$ is a very good top-level algorithm, and $UCB$ and $M3$ are very good basic algorithms. Furthermore, two-level hedging algorithms are more…
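The decision rule described in the abstract (the top-level algorithm picks a basic algorithm, and that algorithm picks the action) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard UCB1 bandit rule as the top-level algorithm and as one basic algorithm, with a uniform random player as the second basic algorithm, because the definitions of $S$ and $M3$ are not given here. The names `UCB1`, `UniformRandom`, `Hedger`, and `play` are hypothetical, rewards are assumed to lie in [0, 1], and updating only the chosen basic algorithm is a design choice of this sketch.

```python
import math
import random


class UCB1:
    """Standard UCB1 bandit over n arms; rewards assumed in [0, 1]."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.sums = [0.0] * n_arms
        self.t = 0

    def choose(self):
        # Play every arm once before using the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        return max(
            range(len(self.counts)),
            key=lambda a: self.sums[a] / self.counts[a]
            + math.sqrt(2.0 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward
        self.t += 1


class UniformRandom:
    """Baseline basic algorithm: ignores feedback, plays uniformly."""

    def __init__(self, n_actions, rng):
        self.n_actions = n_actions
        self.rng = rng

    def choose(self):
        return self.rng.randrange(self.n_actions)

    def update(self, action, reward):
        pass


class Hedger:
    """Top-level algorithm picks a basic algorithm, which picks the action.

    Assumption of this sketch: only the chosen basic algorithm is updated.
    """

    def __init__(self, top, basics):
        self.top = top
        self.basics = basics
        self.chosen = None

    def choose(self):
        self.chosen = self.top.choose()
        return self.basics[self.chosen].choose()

    def update(self, action, reward):
        self.top.update(self.chosen, reward)
        self.basics[self.chosen].update(action, reward)


def play(agent, payoff, opponent_action, rounds):
    """Mean return of `agent` (row player) against a fixed column choice."""
    total = 0.0
    for _ in range(rounds):
        action = agent.choose()
        reward = payoff[action][opponent_action]
        agent.update(action, reward)
        total += reward
    return total / rounds
```

Because `Hedger` exposes the same `choose`/`update` interface as the basic algorithms, a `Hedger` can itself be passed as a basic algorithm to another `Hedger`, which is one way to build the multi-level hedgers the abstract refers to. For example, `play(Hedger(UCB1(2), [UCB1(2), UniformRandom(2, random.Random(0))]), [[0.0, 1.0], [1.0, 0.0]], 0, 2000)` plays 2000 rounds against an opponent that always picks column 0.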
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Artificial Intelligence in Games
