Maximizing utility in multi-agent environments by anticipating the behavior of other learners

Angelos Assos; Yuval Dagan; Constantinos Daskalakis

arXiv:2407.04889·cs.GT·July 9, 2024

TL;DR

This paper explores how an optimizer can strategically plan in multi-agent repeated games to maximize utility by anticipating a learner's behavior, providing algorithms for zero-sum games and showing computational hardness in general-sum games.

Contribution

The paper introduces an algorithm for the optimizer in zero-sum games against a learner that runs replicator dynamics, and it analyzes the computational complexity of utility maximization in general-sum games.

Findings

01. The optimizer can maximize its utility in zero-sum games against a learner that runs replicator dynamics.

02. No FPTAS exists for utility maximization against best-responding learners unless P=NP.

03. Algorithms can guarantee higher average utility than the one-shot game in certain settings.
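To make the second finding concrete, the sketch below evaluates an optimizer's action sequence against a best-responding learner and finds the best sequence by exhaustive search. It is an illustration only, not the paper's construction. It assumes the learner best-responds to the empirical history of the optimizer's past actions and breaks ties toward the lowest index. The function names and the matching-pennies payoffs are hypothetical. The search enumerates every sequence, so its cost grows exponentially with the horizon, which is why it is only usable on toy instances.

```python
from itertools import product


def best_response(counts, learner_utility):
    """Learner action with the highest total payoff against the history.

    counts[i] is how often the optimizer has played action i so far.
    learner_utility[i][j] is the learner's payoff when the optimizer
    plays i and the learner plays j. Ties go to the lowest index
    (an assumption made for this sketch).
    """
    n = len(learner_utility[0])
    totals = [
        sum(c * learner_utility[i][j] for i, c in enumerate(counts))
        for j in range(n)
    ]
    return max(range(n), key=lambda j: (totals[j], -j))


def sequence_utility(actions, optimizer_utility, learner_utility):
    """Optimizer's total utility for a fixed sequence of pure actions."""
    counts = [0] * len(optimizer_utility)
    total = 0.0
    for a in actions:
        b = best_response(counts, learner_utility)  # learner moves first on history
        total += optimizer_utility[a][b]
        counts[a] += 1  # the learner observes the optimizer's action
    return total


def brute_force_optimizer(horizon, optimizer_utility, learner_utility):
    """Try all len(optimizer_utility) ** horizon sequences; return (value, sequence)."""
    m = len(optimizer_utility)
    return max(
        (
            (sequence_utility(seq, optimizer_utility, learner_utility), seq)
            for seq in product(range(m), repeat=horizon)
        ),
        key=lambda pair: pair[0],
    )


# Hypothetical example: matching pennies, with the optimizer as the row player.
A = [[1, -1], [-1, 1]]                 # optimizer's payoffs
B = [[-x for x in row] for row in A]   # learner's payoffs (zero-sum)
```

In this toy game, `brute_force_optimizer(3, A, B)` finds a sequence that stays one step ahead of the learner and wins every round, whereas the one-shot game has value 0 for both players.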

Abstract

Learning algorithms are often used to make decisions in sequential decision-making environments. In multi-agent settings, the decisions of each agent can affect the utilities/losses of the other agents. Therefore, if an agent is good at anticipating the behavior of the other agents, in particular how they will make decisions in each round as a function of their experience thus far, it could try to judiciously make its own decisions over the rounds of the interaction so as to influence the other agents to behave in a way that ultimately benefits its own utility. In this paper, we study repeated two-player games involving two types of agents: a learner, which employs an online learning algorithm to choose its strategy in each round; and an optimizer, which knows the learner's utility function and the learner's online learning algorithm. The optimizer wants to plan ahead to maximize its…
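The learner/optimizer setup described in the abstract can be simulated in a few lines. The sketch below is a minimal illustration under stated assumptions, not the paper's algorithm. The learner runs Multiplicative Weights Update (MWU), the standard discrete-time analogue of replicator dynamics. The optimizer knows the learner's payoff matrix and update rule, so it can predict the learner's mixed strategy in every round and score any plan of its own. The function names, the step size `eta`, and the payoff-matrix convention are hypothetical choices made for this example.

```python
import math


def mwu_learner_play(optimizer_actions, learner_utility, eta=0.5):
    """Return the learner's mixed strategy in each round under MWU.

    learner_utility[i][j] is the learner's payoff when the optimizer
    plays pure action i and the learner plays pure action j. The
    strategy used in round t depends only on rounds before t.
    """
    n = len(learner_utility[0])
    cumulative = [0.0] * n  # learner's cumulative payoff per action
    strategies = []
    for a in optimizer_actions:
        # Softmax over cumulative payoffs; subtract the max for stability.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        strategies.append([w / total for w in weights])
        # The learner then observes the payoff of every action this round.
        for j in range(n):
            cumulative[j] += learner_utility[a][j]
    return strategies


def optimizer_total_utility(optimizer_actions, optimizer_utility, strategies):
    """Optimizer's expected total utility against the predicted strategies."""
    return sum(
        sum(p * optimizer_utility[a][j] for j, p in enumerate(y))
        for a, y in zip(optimizer_actions, strategies)
    )
```

Because the learner's rule is deterministic, the optimizer can evaluate a candidate plan offline by calling `mwu_learner_play` and then `optimizer_total_utility`. Searching over plans efficiently is the hard part that the paper addresses, and this sketch does not attempt it.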

Videos

Maximizing utility in multi-agent environments by anticipating the behavior of other learners (SlidesLive)

Taxonomy

Topics: Reinforcement Learning in Robotics · Fuzzy Logic and Control Systems · Data Stream Mining Techniques