Learning to Persuade a Biased Receiver

Yuqi Pan; Sadie Zhao; Milind Tambe; Yiling Chen

arXiv:2605.15331·cs.GT·May 18, 2026

TL;DR

This paper develops a learning algorithm that lets a sender persuade a receiver whose belief updates are biased over repeated interactions, achieving $O(\log\log T)$ regret that matches a proven lower bound.

Contribution

It introduces a safe exploration algorithm that learns the receiver's bias from observed actions alone while maintaining high persuasion value in a repeated information design setting.

Findings

01. Achieves $O(\log\log T)$ regret in learning the receiver's bias.

02. Proves a matching lower bound of $\Omega(\log\log T)$, confirming optimality.

03. Extends to settings with unknown prior, bias, and time-varying utilities.

Abstract

We study a repeated information design setting in which the receiver, who is also the decision-maker, updates beliefs in a systematically biased way. More specifically, a distorted posterior in our model can be written as a convex combination of the prior and the Bayesian posterior, governed by a fixed but unknown parameter. Over repeated interactions, the sender chooses persuasive signaling schemes, observes only the receiver's realized actions, and seeks to minimize regret relative to a full-information oracle that knows the receiver's biased updating rule. We propose a safe exploration algorithm for learning the receiver's bias while maintaining high persuasion value. The algorithm exploits the asymmetric cost of probing: conservative probes incur only local loss, whereas overly aggressive probes may lose the persuasive opportunity entirely. For general finite state and action spaces…
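
The biased updating rule described in the abstract can be illustrated with a minimal sketch. The abstract states only that the distorted posterior is a convex combination of the prior and the Bayesian posterior; the function names, the symbol `lam`, and the convention that `lam` is the weight placed on the Bayesian posterior are assumptions made here for illustration, not notation taken from the paper.

```python
from typing import List, Sequence


def bayes_posterior(prior: Sequence[float], likelihood: Sequence[float]) -> List[float]:
    """Standard Bayesian posterior over states.

    prior[i] is the prior probability of state i; likelihood[i] is the
    probability that the observed signal is sent in state i.
    """
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    if total <= 0:
        raise ValueError("observed signal has zero probability under the prior")
    return [j / total for j in joint]


def biased_posterior(
    prior: Sequence[float], likelihood: Sequence[float], lam: float
) -> List[float]:
    """Distorted posterior as a convex combination of prior and Bayesian posterior.

    Assumption (not from the paper): lam in [0, 1] is the weight on the
    Bayesian posterior, so lam = 1 is a fully Bayesian receiver and
    lam = 0 is a receiver who ignores the signal.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    bayes = bayes_posterior(prior, likelihood)
    return [(1.0 - lam) * p + lam * b for p, b in zip(prior, bayes)]


# Two states with a uniform prior; the signal is three times as likely in state 1.
prior = [0.5, 0.5]
likelihood = [0.25, 0.75]
half_biased = biased_posterior(prior, likelihood, lam=0.5)
```

Under this convention, a receiver with `lam = 0.5` moves only halfway from the prior toward the Bayesian posterior, so a signal that would persuade a Bayesian receiver may fall short of the belief needed to change this receiver's action. The sender in the paper does not observe `lam` and must infer it from the receiver's realized actions.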
