The Sample Complexity of Online Contract Design
Banghua Zhu, Stephen Bates, Zhuoran Yang, Yixin Wang, Jiantao Jiao, and Michael I. Jordan

TL;DR
This paper investigates the sample complexity of online contract design in hidden-action principal-agent problems, providing upper and lower bounds on Stackelberg regret that match exactly for linear contracts, and introducing new algorithms to handle discontinuous utility functions.
Contribution
It introduces an online learning algorithm with tight regret bounds for contract design, resolving open problems on sample complexity and handling discontinuities in utility functions.
Findings
Upper bound on Stackelberg regret: \(\widetilde{O}(\sqrt{m} \cdot T^{1-1/(2m+1)})\)
Lower bound on regret: \(\Omega(T^{1-1/(m+2)})\)
Exact \(\Theta(T^{2/3})\) regret for linear contracts
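To see how quickly the two exponents approach 1 as m grows, the short sketch below evaluates 1 - 1/(2m+1) and 1 - 1/(m+2) with exact fractions. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def upper_exponent(m: int) -> Fraction:
    """Exponent of T in the upper bound: 1 - 1/(2m+1)."""
    return 1 - Fraction(1, 2 * m + 1)


def lower_exponent(m: int) -> Fraction:
    """Exponent of T in the lower bound: 1 - 1/(m+2)."""
    return 1 - Fraction(1, m + 2)


if __name__ == "__main__":
    for m in (1, 2, 5, 10):
        print(f"m={m}: upper T^{upper_exponent(m)}, lower T^{lower_exponent(m)}")
```

At m = 1 both exponents equal 2/3, which lines up with the T^{2/3} rate listed for linear contracts. For larger m the two exponents differ, and both tend to 1.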
Abstract
We study the hidden-action principal-agent problem in an online setting. In each round, the principal posts a contract that specifies the payment to the agent based on each outcome. The agent then makes a strategic choice of action that maximizes her own utility, but the action is not directly observable by the principal. The principal observes the outcome and receives utility from the agent's choice of action. Based on past observations, the principal dynamically adjusts the contracts with the goal of maximizing her utility. We introduce an online learning algorithm and provide an upper bound on its Stackelberg regret. We show that when the contract space is \([0,1]^m\), the Stackelberg regret is upper bounded by \(\widetilde{O}(\sqrt{m} \cdot T^{1-1/(2m+1)})\), and lower bounded by \(\Omega(T^{1-1/(m+2)})\), where \(\widetilde{O}\) omits logarithmic factors. This result shows that…
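The round structure described in the abstract can be sketched as a small simulation. This is not the paper's algorithm. It is a naive explore-then-commit baseline over a grid of linear contracts, used only to make the protocol and the regret accounting concrete. The instance (costs, outcome distributions, rewards), the grid size, and all function names are hypothetical. Ties in the agent's choice are broken toward the lowest action index, which is a simplification.

```python
import numpy as np

# Hypothetical instance (not from the paper): 3 actions, 3 outcomes.
COSTS = np.array([0.0, 0.1, 0.3])        # agent's cost for each action
OUTCOME_PROBS = np.array([               # row a: outcome distribution under action a
    [0.8, 0.15, 0.05],
    [0.3, 0.5, 0.2],
    [0.1, 0.3, 0.6],
])
REWARDS = np.array([0.0, 0.5, 1.0])      # principal's reward for each outcome


def agent_best_response(contract):
    """Action maximizing the agent's expected payment minus cost."""
    return int(np.argmax(OUTCOME_PROBS @ contract - COSTS))


def principal_expected_utility(contract):
    """Expected reward minus payment, given the agent's best response."""
    action = agent_best_response(contract)
    return float(OUTCOME_PROBS[action] @ (REWARDS - contract))


def play_round(contract, rng):
    """One round: the principal sees the outcome, never the action."""
    action = agent_best_response(contract)
    outcome = rng.choice(len(REWARDS), p=OUTCOME_PROBS[action])
    return outcome, REWARDS[outcome] - contract[outcome]


def run_explore_then_commit(T=2000, grid_size=11, explore_rounds=1100, seed=0):
    """Return the regret of a naive baseline against the best linear contract."""
    rng = np.random.default_rng(seed)
    # Linear contracts pay the agent a fixed share alpha of the reward.
    contracts = [alpha * REWARDS for alpha in np.linspace(0.0, 1.0, grid_size)]
    totals = np.zeros(grid_size)
    counts = np.zeros(grid_size)
    # Benchmark: best linear contract, approximated on a fine grid of shares.
    best = max(principal_expected_utility(alpha * REWARDS)
               for alpha in np.linspace(0.0, 1.0, 10001))
    committed = None
    regret = 0.0
    for t in range(T):
        if t < explore_rounds:
            k = t % grid_size                      # round-robin exploration
        else:
            if committed is None:                  # pick the empirical best once
                committed = int(np.argmax(totals / np.maximum(counts, 1)))
            k = committed
        _, utility = play_round(contracts[k], rng)
        totals[k] += utility
        counts[k] += 1
        regret += best - principal_expected_utility(contracts[k])
    return regret


if __name__ == "__main__":
    print("regret of the baseline:", run_explore_then_commit())
```

In this toy instance the principal's expected utility jumps whenever the agent's best response switches actions. That makes the utility discontinuous in the contract, so a coarse grid can miss the best contract by a constant margin.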
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
