The Algorithmic Advantage: How Reinforcement Learning Generates Rich Communication
Emilio Calvano, Clemens Possnig, Juha Tolvanen

TL;DR
This paper studies how reinforcement learning shapes strategic communication. With aligned preferences, learning yields informative exchanges even from initially uninformative policies; with misaligned preferences, it generates cycles that sustain high payoffs.
Contribution
It provides a theoretical analysis of how reinforcement learning affects communication outcomes in the cheap-talk framework, highlighting conditions for informative communication and cyclical dynamics.
Findings
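With aligned preferences, learning robustly leads to informative communication, even from uninformative initial policies.
With misaligned preferences, no stable outcome exists; learning generates cycles instead.
These cycles sustain highly informative communication and payoffs exceeding those of any static equilibrium.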
Abstract
We analyze strategic communication when advice is generated by a reinforcement-learning algorithm rather than by a fully rational sender. Building on the cheap-talk framework of Crawford and Sobel (1982), an advisor adapts its messages based on payoff feedback, while a decision maker best-responds. We provide a theoretical analysis of the long-run communication outcomes induced by such reward-driven adaptation. With aligned preferences, we establish that learning robustly leads to informative communication even from uninformative initial policies. With misaligned preferences, no stable outcome exists; instead, learning generates cycles that sustain highly informative communication and payoffs exceeding those of any static equilibrium.
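The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' model or algorithm. It assumes a finite state space, quadratic-loss payoffs (the standard example from the cheap-talk literature), and a simple bandit-style value update for the advisor. All function names (`receiver_best_response`, `sender_payoff`, `q_update`) and the `bias` and `step` parameters are hypothetical.

```python
def receiver_best_response(policy, states, prior):
    """Best-response action for each message, assuming quadratic loss.

    policy[i][m] is the probability that the advisor sends message m in
    state i. Under quadratic loss the best response to a message is the
    posterior mean of the state. Messages that are never sent fall back
    to the prior mean (an assumption made for this sketch).
    """
    prior_mean = sum(p * s for p, s in zip(prior, states))
    actions = []
    for m in range(len(policy[0])):
        weight = sum(prior[i] * policy[i][m] for i in range(len(states)))
        if weight == 0:
            actions.append(prior_mean)
        else:
            total = sum(prior[i] * policy[i][m] * states[i]
                        for i in range(len(states)))
            actions.append(total / weight)
    return actions


def sender_payoff(action, state, bias):
    """Advisor's payoff, assuming quadratic loss around state + bias.

    bias = 0 corresponds to aligned preferences; bias != 0 to misaligned.
    """
    return -(action - state - bias) ** 2


def q_update(q, state_idx, message, reward, step=0.1):
    """Move the advisor's value estimate for (state, message) toward reward."""
    q[state_idx][message] += step * (reward - q[state_idx][message])
    return q


states = [0.0, 1.0]
prior = [0.5, 0.5]
babbling = [[0.5, 0.5], [0.5, 0.5]]    # messages carry no information
separating = [[1.0, 0.0], [0.0, 1.0]]  # each state has its own message

babbling_actions = receiver_best_response(babbling, states, prior)
separating_actions = receiver_best_response(separating, states, prior)
```

Under the uninformative policy the decision maker takes the prior mean after every message, while under the separating policy the action matches the state. A learning loop would alternate `q_update` steps for the advisor with recomputed best responses for the decision maker; how the paper specifies that loop is not stated in this summary.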
Taxonomy
Topics: Game Theory and Applications · Media Influence and Politics · Experimental Behavioral Economics Studies
