Estimating and Incentivizing Imperfect-Knowledge Agents with Hidden Rewards

Ilgin Dogan; Zuo-Jun Max Shen; Anil Aswani

arXiv:2308.06717 · cs.LG · August 15, 2023

Open Access

TL;DR

This paper studies a repeated principal-agent game in which the principal cannot observe the agent's rewards and both parties learn over time. It proposes a new estimator and an incentive policy, each with theoretical guarantees.

Contribution

It introduces a non-parametric estimator and a data-driven incentive policy for a repeated learning game with hidden rewards, providing finite-sample guarantees and regret bounds.

Findings

01. Estimator achieves finite-sample consistency.

02. Principal's regret is rigorously bounded.

03. Framework applicable to green energy contracts.

Abstract

In practice, incentive providers (i.e., principals) often cannot observe the reward realizations of incentivized agents, which is in contrast to many principal-agent models that have been previously studied. This information asymmetry challenges the principal to consistently estimate the agent's unknown rewards by solely watching the agent's decisions, which becomes even more challenging when the agent has to learn its own rewards. This complex setting is observed in various real-life scenarios ranging from renewable energy storage contracts to personalized healthcare incentives. Hence, it offers not only interesting theoretical questions but also wide practical relevance. This paper explores a repeated adverse selection game between a self-interested learning agent and a learning principal. The agent tackles a multi-armed bandit (MAB) problem to maximize their expected reward plus…
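
The setting described in the abstract can be illustrated with a minimal sketch. It is not the paper's estimator or incentive policy: the greedy agent rule, the Gaussian noise, the fixed incentive vector, and the names `simulate`, `true_means`, and `incentives` are all assumptions made for illustration. The sketch shows only the information asymmetry: the agent sees its noisy rewards, while the principal sees nothing but the sequence of chosen arms.

```python
import random


def simulate(true_means, incentives, rounds, seed=0):
    """Run a greedy learning agent and return only the arm choices.

    Hypothetical simplification: the agent picks the arm with the highest
    (empirical mean reward + offered incentive); incentives stay fixed.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k   # number of pulls per arm (agent's private data)
    sums = [0.0] * k   # cumulative reward per arm (agent's private data)
    choices = []       # the only thing the principal gets to observe

    for t in range(rounds):
        if t < k:
            arm = t  # assumption: pull each arm once to initialize estimates
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a] + incentives[a],
            )
        # Reward realization: seen by the agent, hidden from the principal.
        reward = rng.gauss(true_means[arm], 0.1)
        counts[arm] += 1
        sums[arm] += reward
        choices.append(arm)

    return choices


if __name__ == "__main__":
    # A large incentive on arm 0 steers the agent away from the better arm 1.
    print(simulate([0.2, 0.5], [1.0, 0.0], rounds=10))
```

In this toy version the principal would have to infer the agent's reward estimates from `choices` alone, which is the estimation problem the abstract refers to.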

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Smart Grid Energy Management