Strategic Apple Tasting
Keegan Harris, Chara Podimata, Zhiwei Steven Wu

TL;DR
This paper introduces algorithms for online decision-making in strategic settings with apple-tasting feedback, achieving sublinear regret in stochastic and adversarial scenarios, applicable to bandit and classification problems.
Contribution
It formalizes the strategic decision-making problem with apple-tasting feedback and provides algorithms with provable regret bounds for stochastic and adversarial agent sequences.
Findings
Achieves $O(\sqrt{T})$ strategic regret when the sequence of agents is stochastic.
Handles adversarially chosen agents with $O(T^{(d+1)/(d+2)})$ strategic regret, where $d$ is the dimension of the agents' context.
Algorithms adapt to bandit feedback and strategic classification settings.
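To make the gap between the two bounds concrete, the sketch below compares only the exponents of $T$ in the stochastic and adversarial rates quoted above. It ignores constants and any logarithmic factors, and the function name `adversarial_regret_exponent` is a hypothetical helper introduced here for illustration, not something from the paper.

```python
from fractions import Fraction

# Exponent of T in the stochastic-agent bound O(sqrt(T)).
STOCHASTIC_EXPONENT = Fraction(1, 2)


def adversarial_regret_exponent(d: int) -> Fraction:
    """Exponent of T in the adversarial-agent bound O(T^((d+1)/(d+2))).

    `d` is the context dimension. This is a hypothetical helper that
    only evaluates the exponent; it is not an algorithm from the paper.
    """
    if d < 1:
        raise ValueError("context dimension d must be a positive integer")
    return Fraction(d + 1, d + 2)


for d in (1, 2, 5, 20):
    e = adversarial_regret_exponent(d)
    print(f"d={d:>2}: adversarial exponent {e} (~{float(e):.3f}), "
          f"stochastic exponent {STOCHASTIC_EXPONENT}")
```

The adversarial exponent stays strictly below 1 for every finite $d$, so the bound remains sublinear, but it approaches 1 as the context dimension grows.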
Abstract
Algorithmic decision-making in high-stakes domains often involves assigning decisions to agents with incentives to strategically modify their input to the algorithm. In addition to dealing with incentives, in many domains of interest (e.g. lending and hiring) the decision-maker only observes feedback regarding their policy for rounds in which they assign a positive decision to the agent; this type of feedback is often referred to as apple tasting (or one-sided) feedback. We formalize this setting as an online learning problem with apple-tasting feedback where a principal makes decisions about a sequence of agents, each of which is represented by a context that may be strategically modified. Our goal is to achieve sublinear strategic regret, which compares the performance of the principal to that of the best fixed policy in hindsight, if the agents were truthful when revealing their…
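As a rough illustration of the apple-tasting (one-sided) feedback described in the abstract, the sketch below simulates a principal who sees an outcome only on rounds where the decision is positive. Everything in it is an assumption made for illustration: the one-dimensional contexts, the outcome rule, the fixed threshold policy, and the names `apple_tasting_feedback` and `run_rounds`. It does not model strategic modification of contexts and is not the paper's algorithm.

```python
import random
from typing import Callable, List, Optional, Tuple


def apple_tasting_feedback(decision: int, true_outcome: int) -> Optional[int]:
    """Reveal the outcome only when the decision is positive (1)."""
    return true_outcome if decision == 1 else None


def run_rounds(
    policy: Callable[[float], int],
    contexts: List[float],
    outcomes: List[int],
) -> List[Tuple[float, int, Optional[int]]]:
    """Play one decision per agent and record (context, decision, feedback)."""
    history = []
    for x, y in zip(contexts, outcomes):
        decision = policy(x)
        feedback = apple_tasting_feedback(decision, y)
        history.append((x, decision, feedback))
    return history


# Hypothetical data: 1-D contexts, with outcome 1 whenever the context exceeds 0.5.
rng = random.Random(0)
contexts = [rng.random() for _ in range(8)]
outcomes = [1 if x > 0.5 else 0 for x in contexts]


# Hypothetical fixed policy: assign a positive decision when the context is at least 0.4.
def threshold_policy(x: float) -> int:
    return 1 if x >= 0.4 else 0


history = run_rounds(threshold_policy, contexts, outcomes)
for x, decision, feedback in history:
    shown = "unobserved" if feedback is None else feedback
    print(f"context={x:.2f} decision={decision} feedback={shown}")
```

Rounds with a negative decision carry no feedback, so a learner cannot tell from those rounds alone whether rejecting was a mistake.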
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems
