Truthful mechanisms for linear bandit games with private contexts
Yiting Hu, Lingjie Duan

TL;DR
This paper introduces a new mechanism for linear contextual bandit games that ensures truthful reporting of private contexts, achieving low regret and improving reliability in applications like healthcare and personalized recommendations.
Contribution
This paper is the first to address private context misreporting in stochastic contextual bandit games, and it proposes a linear-program-based mechanism that guarantees truthfulness with logarithmic regret.
Findings
Mechanism ensures truthful context reporting.
Achieves $O(\ln T)$ regret in theory.
Performs well in numerical experiments.
Abstract
The contextual bandit problem, where agents arrive sequentially with personal contexts and the system adapts its arm allocation decisions accordingly, has recently garnered increasing attention for enabling more personalized outcomes. However, in many healthcare and recommendation applications, agents have private profiles and may misreport their contexts to gain from the system. For example, in adaptive clinical trials, where hospitals sequentially recruit volunteers to test multiple new treatments and adjust plans based on volunteers' reported profiles such as symptoms and interim data, participants may misreport severe side effects like allergy and nausea to avoid perceived suboptimal treatments. We are the first to study this issue of private context misreporting in a stochastic contextual bandit game between the system and non-repeated agents. We show that traditional low-regret…
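The misreporting incentive described in the abstract can be illustrated with a toy example. The sketch below is not the paper's mechanism: it assumes a simple greedy rule that allocates the arm with the highest estimated linear reward for the reported context, and every name and number in it (`theta`, `agent_value`, the two context vectors) is hypothetical.

```python
# Illustrative sketch only: a greedy linear allocation rule facing an agent
# who can misreport. Names and numbers are hypothetical, not from the paper.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def greedy_arm(theta, reported_context):
    """Return the arm with the highest estimated reward for the reported context."""
    scores = [dot(t, reported_context) for t in theta]
    return max(range(len(theta)), key=lambda a: scores[a])

# Hypothetical per-arm parameter estimates held by the system.
theta = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical value the agent privately assigns to receiving each arm.
agent_value = [0.2, 0.9]

true_context = [0.8, 0.3]  # the agent's real profile
misreport = [0.3, 0.8]     # a false profile chosen to steer the allocation

arm_truthful = greedy_arm(theta, true_context)
arm_misreport = greedy_arm(theta, misreport)

# A positive gain means the agent prefers lying under this allocation rule.
gain = agent_value[arm_misreport] - agent_value[arm_truthful]
print(arm_truthful, arm_misreport, gain > 0)
```

In this toy setup the agent's own valuation of the arms differs from the system's estimate, so a false report moves the allocation to the arm the agent prefers. A truthful mechanism has to choose allocations so that no such profitable deviation exists.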
Taxonomy
Topics: Auction Theory and Applications · Advanced Bandit Algorithms Research · Game Theory and Applications
Methods: Softmax · Attention Is All You Need
