# Reactive learning strategies for iterated games

**Authors:** Alex McAvoy, Martin A. Nowak

arXiv: 1903.04443 · 2022-02-18

## TL;DR

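This paper introduces reactive learning strategies for iterated games, which gradually adjust a player's propensity to take certain actions based on the opponent's past actions. Every linear reactive learning strategy corresponds to a memory-one strategy and vice versa, and its feasible payoff region is a subset of the region for the corresponding memory-one strategy, which makes these strategies useful for restricting game outcomes.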

## Contribution

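It establishes a formal correspondence between linear reactive learning strategies and memory-one strategies, proves that the feasible payoff region against a memory-one strategy is the convex hull of at most 11 points, and shows that the feasible payoff region of a reactive learning strategy is contained in that of its memory-one counterpart.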

## Key findings

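- The feasible payoff region against a memory-one strategy $\mathbf{p}$, $\mathcal{C}\left(\mathbf{p}\right)$, is the convex hull in $\mathbb{R}^{2}$ of at most 11 points.
- The feasible payoff region of a reactive learning strategy, $\mathcal{C}\left(\mathbf{p}^{\ast}\right)$, is a subset of the region $\mathcal{C}\left(\mathbf{p}\right)$ of the corresponding memory-one strategy.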
- Reactive learning strategies are powerful tools for outcome restriction in iterated games.

## Abstract

In an iterated game between two players, there is much interest in characterizing the set of feasible payoffs for both players when one player uses a fixed strategy and the other player is free to switch. Such characterizations have led to extortionists, equalizers, partners, and rivals. Most of those studies use memory-one strategies, which specify the probabilities to take actions depending on the outcome of the previous round. Here, we consider "reactive learning strategies," which gradually modify their propensity to take certain actions based on past actions of the opponent. Every linear reactive learning strategy, $\mathbf{p}^{\ast}$, corresponds to a memory-one strategy, $\mathbf{p}$, and vice versa. We prove that for evaluating the region of feasible payoffs against a memory-one strategy, $\mathcal{C}\left(\mathbf{p}\right)$, we need to check its performance against at most $11$ other strategies. Thus, $\mathcal{C}\left(\mathbf{p}\right)$ is the convex hull in $\mathbb{R}^{2}$ of at most $11$ points. Furthermore, if $\mathbf{p}$ is a memory-one strategy, with feasible payoff region $\mathcal{C}\left(\mathbf{p}\right)$, and $\mathbf{p}^{\ast}$ is the corresponding reactive learning strategy, with feasible payoff region $\mathcal{C}\left(\mathbf{p}^{\ast}\right)$, then $\mathcal{C}\left(\mathbf{p}^{\ast}\right)$ is a subset of $\mathcal{C}\left(\mathbf{p}\right)$. Reactive learning strategies are therefore powerful tools in restricting the outcomes of iterated games.
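To make the notion of a feasible payoff point concrete, the following is a minimal Python sketch of how the long-run payoffs of a fixed memory-one strategy against a memory-one opponent can be approximated. It is an illustration under stated assumptions, not the paper's method: the prisoner's dilemma payoff values, the assumption that both players cooperate in the first round, the finite-horizon averaging, and the names `average_payoffs` and `payoff_points` are all hypothetical choices. The 16 deterministic opponents enumerated here are not the 11 strategies identified in the paper; they only supply sample points that approximate elements of $\mathcal{C}\left(\mathbf{p}\right)$.

```python
from itertools import product

# Assumed prisoner's dilemma payoffs (illustrative, not taken from the paper).
R, S, T, P = 3.0, 0.0, 5.0, 1.0

# States are ordered CC, CD, DC, DD, with the fixed player X's action first.
X_PAYOFF = (R, S, T, P)
Y_PAYOFF = (R, T, S, P)


def average_payoffs(p, q, rounds=10_000, start=(1.0, 0.0, 0.0, 0.0)):
    """Approximate long-run average payoffs of memory-one p (X) against q (Y).

    p[s] and q[s] are cooperation probabilities after the previous outcome,
    each written from the owner's point of view (own action first).
    `start` is the assumed distribution over first-round outcomes.
    """
    # Re-express Y's strategy in X's state order: CD for X is DC for Y.
    q_swapped = (q[0], q[2], q[1], q[3])
    dist = list(start)
    total_x = total_y = 0.0
    for _ in range(rounds):
        # Accumulate the expected payoff of the current round.
        total_x += sum(d * u for d, u in zip(dist, X_PAYOFF))
        total_y += sum(d * u for d, u in zip(dist, Y_PAYOFF))
        # Advance the distribution over outcomes by one round.
        nxt = [0.0, 0.0, 0.0, 0.0]
        for s, mass in enumerate(dist):
            px, qy = p[s], q_swapped[s]
            nxt[0] += mass * px * qy
            nxt[1] += mass * px * (1.0 - qy)
            nxt[2] += mass * (1.0 - px) * qy
            nxt[3] += mass * (1.0 - px) * (1.0 - qy)
        dist = nxt
    # Averaging over rounds also handles chains that cycle instead of settling.
    return total_x / rounds, total_y / rounds


def payoff_points(p, rounds=10_000):
    """Payoff pairs of p against all 16 deterministic memory-one opponents."""
    return [average_payoffs(p, q, rounds) for q in product((0.0, 1.0), repeat=4)]
```

For example, `average_payoffs((1, 0, 1, 0), (0, 0, 0, 0))` pits tit-for-tat against unconditional defection, and both averages approach the mutual-defection payoff `P` as `rounds` grows. Taking the convex hull of the output of `payoff_points` gives only a rough inner picture of the region; the paper's result is that at most 11 suitably chosen opponent strategies suffice to determine $\mathcal{C}\left(\mathbf{p}\right)$ exactly.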

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.04443/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1903.04443/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1903.04443/full.md

---
Source: https://tomesphere.com/paper/1903.04443