Learning to Incentivize in Repeated Principal-Agent Problems with Adversarial Agent Arrivals

Junyan Liu; Arnab Maiti; Artin Tajdini; Kevin Jamieson; Lillian J. Ratliff

arXiv:2505.23124 · cs.GT · August 5, 2025

TL;DR

This paper studies a repeated principal-agent problem with adversarial agent arrivals, proposing algorithms with sublinear regret bounds under different assumptions about agent responses, and extending to multiple incentives per round.

Contribution

The paper introduces the first algorithms with provable regret bounds for repeated principal-agent problems in which agents of unknown type arrive in adversarial order, including the case of multiple incentives per round.

Findings

01

Achieves $O\left(\frac{\text{polylog}(N)}{\text{poly}(T)}\right)$ regret bounds under known greedy responses.

02

Establishes lower bounds matching the upper bounds up to logarithmic factors.

03

Extends algorithms to incentivize multiple arms simultaneously in each round.

Abstract

We initiate the study of a repeated principal-agent problem over a finite horizon $T$, where a principal sequentially interacts with $K \geq 2$ types of agents arriving in an adversarial order. At each round, the principal strategically chooses one of the $N$ arms to incentivize for an arriving agent of unknown type. The agent then chooses an arm based on its own utility and the provided incentive, and the principal receives a corresponding reward. The objective is to minimize regret against the best incentive in hindsight. Without prior knowledge of agent behavior, we show that the problem becomes intractable, leading to linear regret. We analyze two key settings where sublinear regret is achievable. In the first setting, the principal knows the arm each agent type would select greedily for any given incentive. Under this setting, we propose an algorithm that achieves a regret bound of…
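To make the interaction protocol and the regret notion concrete, here is a minimal Python sketch of a toy instance. It is not the paper's algorithm. Several modeling choices are assumptions made for illustration only: an incentive is encoded as a hypothetical `(arm, payment)` pair, the agent greedily maximizes its own utility plus the payment with ties broken toward the lowest arm index, the principal pays only when the incentivized arm is chosen, and "the best incentive in hindsight" is taken over a small finite candidate set.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding of an incentive: (incentivized arm, payment).
Incentive = Tuple[int, float]


def agent_choice(agent_utils: Sequence[float], incentive: Incentive) -> int:
    """Greedy response: add the payment to the incentivized arm, pick the best arm.

    Ties are broken toward the lowest arm index (an assumption).
    """
    arm, payment = incentive
    boosted = list(agent_utils)
    boosted[arm] += payment
    return max(range(len(boosted)), key=lambda a: (boosted[a], -a))


def principal_utility(rewards: Sequence[float], chosen: int, incentive: Incentive) -> float:
    """Principal's reward for the chosen arm, minus the payment if it was collected."""
    arm, payment = incentive
    return rewards[chosen] - (payment if chosen == arm else 0.0)


def cumulative_utility(
    arrivals: Sequence[int],
    incentives: Sequence[Incentive],
    agent_types: Sequence[Sequence[float]],
    rewards: Sequence[float],
) -> float:
    """Total principal utility when incentives[t] is offered to the agent arriving at round t."""
    total = 0.0
    for agent_type, incentive in zip(arrivals, incentives):
        chosen = agent_choice(agent_types[agent_type], incentive)
        total += principal_utility(rewards, chosen, incentive)
    return total


def regret(
    arrivals: Sequence[int],
    played: Sequence[Incentive],
    candidates: Sequence[Incentive],
    agent_types: Sequence[Sequence[float]],
    rewards: Sequence[float],
) -> float:
    """Regret of the played sequence against the best fixed candidate in hindsight."""
    best_fixed = max(
        cumulative_utility(arrivals, [c] * len(arrivals), agent_types, rewards)
        for c in candidates
    )
    return best_fixed - cumulative_utility(arrivals, played, agent_types, rewards)


# Toy instance: N = 2 arms, K = 2 agent types, T = 4 rounds (all values made up).
rewards: List[float] = [1.0, 0.0]             # principal's reward per arm
agent_types = [[0.0, 0.5], [0.0, 0.2]]        # each type's own utility per arm
arrivals = [0, 1, 1, 0]                       # type arriving at each round
candidates = [(0, 0.0), (0, 0.3), (0, 0.6)]   # fixed incentives to compare against
played = [(0, 0.3)] * len(arrivals)           # a naive policy that never adapts

print(regret(arrivals, played, candidates, agent_types, rewards))
```

In this toy instance a payment of 0.3 on arm 0 sways only the second agent type, while 0.6 sways both, so the naive policy incurs positive regret. A learning algorithm would have to discover the better incentive online, without observing the type of each arriving agent.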

Videos

Learning to Incentivize in Repeated Principal-Agent Problems with Adversarial Agent Arrivals (SlidesLive)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Optimization and Search Problems