Prior-Agnostic Incentive-Compatible Exploration
Ramya Ramalingam, Osbert Bastani, Aaron Roth

TL;DR
This paper introduces a method for incentivizing agents to follow exploration strategies in bandit problems without requiring a common prior, using swap regret bounds to ensure an approximate Bayes Nash equilibrium in dynamic, multi-agent environments.
Contribution
It shows that (weighted) swap regret bounds on their own can ensure incentive compatibility in dynamic bandit settings where agents hold conflicting priors and no common prior is assumed.
Findings
(Weighted) swap regret bounds suffice for agents to follow forecasts in an approximate Bayes Nash equilibrium.
Agents with some uncertainty about their arrival time follow forecasts faithfully.
The approach applies to adaptive and weighted regret algorithms in bandit problems.
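To make the central quantity in these findings concrete, the following is a minimal sketch of how (weighted) swap regret can be computed for a played sequence of actions. It is an illustration only, not the paper's algorithm: the function name swap_regret, the optional weights argument, and the assumption that the full reward vector is observed each round (rather than bandit feedback) are all hypothetical choices made for this example.

```python
from typing import Optional, Sequence


def swap_regret(
    actions: Sequence[int],
    rewards: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted swap regret of a played action sequence.

    Assumes (for illustration) that rewards[t][b] is known for every
    action b in every round t, i.e. full-information feedback.
    """
    n_actions = len(rewards[0])
    if weights is None:
        weights = [1.0] * len(actions)  # unweighted case

    # gain[a][b]: total weighted reward that action b would have earned
    # on exactly those rounds where action a was actually played.
    gain = [[0.0] * n_actions for _ in range(n_actions)]
    for a, r, w in zip(actions, rewards, weights):
        for b in range(n_actions):
            gain[a][b] += w * r[b]

    # The best swap function picks the best replacement for each played
    # action independently, so the maximum decomposes over actions.
    return sum(max(gain[a]) - gain[a][a] for a in range(n_actions))


# Example: two actions, four rounds.
rewards = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
always_wrong = swap_regret([0, 0, 1, 1], rewards)
always_right = swap_regret([1, 1, 0, 0], rewards)
```

Low swap regret means that no consistent relabeling of the played actions would have done much better in hindsight, which is the sense in which a recommendation can be "worth following" conditional on what was recommended.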
Abstract
In bandit settings, optimizing long-term regret metrics requires exploration, which corresponds to sometimes taking myopically sub-optimal actions. When a long-lived principal merely recommends actions to be executed by a sequence of different agents (as in an online recommendation platform) this provides an incentive misalignment: exploration is "worth it" for the principal but not for the agents. Prior work studies regret minimization under the constraint of Bayesian Incentive-Compatibility in a static stochastic setting with a fixed and common prior shared amongst the agents and the algorithm designer. We show that (weighted) swap regret bounds on their own suffice to cause agents to faithfully follow forecasts in an approximate Bayes Nash equilibrium, even in dynamic environments in which agents have conflicting prior beliefs and the mechanism designer has no knowledge of any…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Auction Theory and Applications
