Maximizing utility in multi-agent environments by anticipating the behavior of other learners
Angelos Assos, Yuval Dagan, Constantinos Daskalakis

TL;DR
This paper explores how an optimizer can strategically plan in multi-agent repeated games to maximize utility by anticipating a learner's behavior, providing algorithms for zero-sum games and showing computational hardness in general-sum games.
Contribution
It introduces an algorithm for the optimizer in zero-sum games against replicator dynamics and analyzes the computational complexity of utility maximization in general-sum games.
Findings
The optimizer can maximize its utility in zero-sum games against a learner that follows replicator dynamics.
No FPTAS exists for utility maximization against best-responding learners unless P=NP.
Algorithms can guarantee higher average utility than the one-shot game in certain settings.
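To make the learner-versus-optimizer setup concrete, the following is a minimal sketch that simulates a learner against a fixed sequence of optimizer actions in a zero-sum game. It is not the paper's algorithm. It assumes the learner runs multiplicative weights, the standard discrete-time analogue of replicator dynamics. The function name `mwu_learner_play`, the step size `eta`, and the payoff-matrix convention are all hypothetical choices made for illustration.

```python
import numpy as np

def mwu_learner_play(A, optimizer_actions, eta=0.1):
    """Simulate a multiplicative-weights learner in a zero-sum game.

    Assumed convention (not from the paper): A[i, j] is the optimizer's
    utility when the optimizer plays row i and the learner plays column j,
    so the learner's utility is -A[i, j].
    Returns the optimizer's total expected utility and the learner's
    mixed strategy in every round.
    """
    A = np.asarray(A, dtype=float)
    cum_utility = np.zeros(A.shape[1])  # learner's cumulative utility per action
    optimizer_total = 0.0
    strategies = []
    for i in optimizer_actions:
        # Learner's mixed strategy: weights proportional to
        # exp(eta * cumulative utility); shift by the max for stability.
        z = eta * cum_utility
        y = np.exp(z - z.max())
        y /= y.sum()
        strategies.append(y)
        # Optimizer's expected utility this round against the mixed strategy.
        optimizer_total += float(A[i] @ y)
        # Learner observes the optimizer's action and updates its utilities.
        cum_utility += -A[i]
    return optimizer_total, np.array(strategies)

if __name__ == "__main__":
    # Matching pennies, with the optimizer repeating its first action.
    pennies = np.array([[1.0, -1.0], [-1.0, 1.0]])
    total, ys = mwu_learner_play(pennies, [0, 0, 0, 0], eta=0.5)
    print("optimizer total utility:", total)
    print("learner strategies per round:")
    print(ys)
```

Because the learner's strategy in each round depends only on the optimizer's past actions, an optimizer that knows the update rule can evaluate any candidate action sequence this way before committing to it. In this sketch, an optimizer that repeats one action in matching pennies is exploited more heavily each round as the learner shifts weight toward the counter-action.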
Abstract
Learning algorithms are often used to make decisions in sequential decision-making environments. In multi-agent settings, the decisions of each agent can affect the utilities/losses of the other agents. Therefore, if an agent is good at anticipating the behavior of the other agents, in particular how they will make decisions in each round as a function of their experience thus far, it could try to judiciously make its own decisions over the rounds of the interaction so as to influence the other agents to behave in a way that ultimately benefits its own utility. In this paper, we study repeated two-player games involving two types of agents: a learner, which employs an online learning algorithm to choose its strategy in each round; and an optimizer, which knows the learner's utility function and the learner's online learning algorithm. The optimizer wants to plan ahead to maximize its…
Taxonomy
Topics: Reinforcement Learning in Robotics · Fuzzy Logic and Control Systems · Data Stream Mining Techniques
