Learning in Games: Robustness of Fast Convergence

Dylan J. Foster; Zhiyuan Li; Thodoris Lykouris; Karthik Sridharan; Éva Tardos

arXiv:1606.06244 · cs.GT · December 19, 2016 · 20 cites

TL;DR

This paper demonstrates that learning algorithms with low approximate regret lead to rapid convergence to near-optimal outcomes in repeated games, broadening the scope of applicable algorithms and feedback models.

Contribution

It introduces a low approximate regret property that ensures fast convergence in various game settings, including bandit feedback and dynamic populations, with improved speed and broader applicability.

Findings

01 Fast convergence to approximate optimality, with high probability, in repeated games

02 Applicability to bandit feedback and dynamic populations

03 Convergence speed improved by a factor of $n$, the number of players, over Syrgkanis et al. [SALS15]

Abstract

We show that learning algorithms satisfying a low approximate regret property experience fast convergence to approximate optimality in a large class of repeated games. Our property, which simply requires that each learner has small regret compared to a $(1 + \epsilon)$-multiplicative approximation to the best action in hindsight, is ubiquitous among learning algorithms; it is satisfied even by the vanilla Hedge forecaster. Our results improve upon recent work of Syrgkanis et al. [SALS15] in a number of ways. We require only that players observe payoffs under other players' realized actions, as opposed to expected payoffs. We further show that convergence occurs with high probability, and show convergence under bandit feedback. Finally, we improve upon the speed of convergence by a factor of $n$, the number of players. Both the scope of settings and the class of algorithms for…
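To make the property concrete, the following is a minimal sketch of the vanilla Hedge forecaster, assuming a loss-minimization setting with per-round losses in $[0, 1]$ and a learning rate equal to $\epsilon$. These conventions, the function names, and the constant in the bound are illustrative assumptions, not the paper's exact definitions. Under them, the classical Hedge analysis gives expected loss at most $(1 + \epsilon) L^* + 2 \ln(N) / \epsilon$ for $\epsilon \in (0, 1]$, where $L^*$ is the loss of the best of $N$ actions in hindsight.

```python
import math
import random


def hedge_expected_loss(losses, eta):
    """Run Hedge (exponential weights) and return its cumulative expected loss.

    losses: list of rounds, each a list of per-action losses in [0, 1].
    eta: learning rate (assumption: set equal to epsilon in this sketch).
    """
    n_actions = len(losses[0])
    log_weights = [0.0] * n_actions  # log-space weights for numerical stability
    total = 0.0
    for round_losses in losses:
        # Turn log-weights into a probability distribution over actions.
        shift = max(log_weights)
        weights = [math.exp(lw - shift) for lw in log_weights]
        norm = sum(weights)
        probs = [w / norm for w in weights]
        # Expected loss of the randomized choice in this round.
        total += sum(p * l for p, l in zip(probs, round_losses))
        # Multiplicative update: weight_i *= exp(-eta * loss_i).
        log_weights = [lw - eta * l for lw, l in zip(log_weights, round_losses)]
    return total


def best_action_loss(losses):
    """Cumulative loss of the best fixed action in hindsight."""
    n_actions = len(losses[0])
    return min(sum(r[i] for r in losses) for i in range(n_actions))


def approximate_regret(losses, eps):
    """Hedge's loss minus (1 + eps) times the best action's loss."""
    return hedge_expected_loss(losses, eps) - (1 + eps) * best_action_loss(losses)


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic losses, for illustration only
    n_actions, n_rounds, eps = 5, 200, 0.1
    losses = [[rng.random() for _ in range(n_actions)] for _ in range(n_rounds)]
    print("approximate regret:", approximate_regret(losses, eps))
    print("bound 2 ln(N) / eps:", 2 * math.log(n_actions) / eps)
```

The additive term does not grow with the number of rounds, which is the sense in which the regret against the $(1 + \epsilon)$-approximate benchmark is small in this sketch.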

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques