Adaptive Regret Minimization in Bounded-Memory Games

Jeremiah Blocki; Nicolas Christin; Anupam Datta; Arunesh Sinha

arXiv:1111.2888 · cs.GT · September 6, 2013

TL;DR

This paper introduces a framework for regret minimization in bounded-memory games, where rewards can depend on the history of play, and proposes efficient algorithms for play against oblivious adversaries.

Contribution

The paper defines k-adaptive regret for bounded-memory games and develops algorithms for approximate regret minimization, highlighting the complexity gap between the perfect- and imperfect-information settings.

Findings

01

Hardness result for imperfect information games that holds unless NP = RP.

02

Efficient approximate regret minimization algorithms for perfect and imperfect information games against oblivious adversaries.

03

Introduction of the k-adaptive regret concept for history-dependent environments.

Abstract

Online learning algorithms that minimize regret provide strong guarantees in situations that involve repeatedly making decisions in an uncertain environment, e.g. a driver deciding what route to drive to work every day. While regret minimization has been extensively studied in repeated games, we study regret minimization for a richer class of games called bounded memory games. In each round of a two-player bounded memory-m game, both players simultaneously play an action, observe an outcome and receive a reward. The reward may depend on the last m outcomes as well as the actions of the players in the current round. The standard notion of regret for repeated games is no longer suitable because actions and rewards can depend on the history of play. To account for this generality, we introduce the notion of k-adaptive regret, which compares the reward obtained by playing actions prescribed…
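The round structure of a bounded memory-m game described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's formalism: the function name `play_bounded_memory_game`, the strategy signatures, and the toy outcome and reward functions are all hypothetical. It also assumes that the "last m outcomes" are those observed before the current round, and that each strategy sees only that window.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Action = int
Outcome = int
History = Tuple[Outcome, ...]  # the last m outcomes, oldest first


def play_bounded_memory_game(
    m: int,
    rounds: int,
    player: Callable[[History], Action],
    adversary: Callable[[History], Action],
    outcome_fn: Callable[[Action, Action], Outcome],
    reward_fn: Callable[[History, Action, Action], float],
) -> List[float]:
    """Simulate a two-player bounded memory-m game and return the player's rewards."""
    window: Deque[Outcome] = deque(maxlen=m)  # older outcomes are dropped automatically
    rewards: List[float] = []
    for _ in range(rounds):
        history = tuple(window)
        # Both actions are chosen from the same history, i.e. simultaneously.
        p = player(history)
        a = adversary(history)
        # Assumption: reward depends on the m outcomes before this round
        # plus the two current actions.
        rewards.append(reward_fn(history, p, a))
        window.append(outcome_fn(p, a))
    return rewards


# Toy instance (hypothetical): the outcome records whether the actions matched,
# and every match remembered in the window costs the player 0.5.
def always_zero(history: History) -> Action:
    return 0


def match_outcome(p: Action, a: Action) -> Outcome:
    return int(p == a)


def toy_reward(history: History, p: Action, a: Action) -> float:
    return (1.0 if p != a else 0.0) - 0.5 * sum(history)


rewards = play_bounded_memory_game(
    m=2,
    rounds=4,
    player=always_zero,
    adversary=always_zero,
    outcome_fn=match_outcome,
    reward_fn=toy_reward,
)
print(rewards)  # [0.0, -0.5, -1.0, -1.0]
```

In this toy run the same pair of actions earns a different reward in round 1 than in round 3, because the reward also depends on the remembered outcomes. That history dependence is what separates bounded memory games from ordinary repeated games.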

Taxonomy

Topics: Advanced Bandit Algorithms Research · Complexity and Algorithms in Graphs · Optimization and Search Problems