Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning

Jibang Wu; Zixuan Zhang; Zhe Feng; Zhaoran Wang; Zhuoran Yang; Michael I. Jordan; Haifeng Xu

arXiv:2202.10678·cs.AI·February 23, 2022

TL;DR

This paper introduces Markov persuasion processes for sequential information design, providing efficient algorithms for optimal signaling policies and extending to reinforcement learning with provable regret bounds.

Contribution

It proposes a novel Markov persuasion process model, develops an efficient RL algorithm with sublinear regret, and extends the approach to large state and outcome spaces using function approximation.

Findings

01. Efficient algorithms for computing optimal signaling policies in MPPs when the model is known.

02. Sublinear regret bounds for the RL algorithm.

03. Extension to large state and outcome spaces via function approximation.

Abstract

In today's economy, it has become important for Internet platforms to consider the sequential information design problem in order to align their long-term interests with the incentives of gig service providers. This paper proposes a novel model of sequential information design, namely Markov persuasion processes (MPPs), in which a sender with an informational advantage seeks to persuade a stream of myopic receivers to take actions that maximize the sender's cumulative utility in a finite-horizon Markovian environment with varying priors and utility functions. Planning in MPPs thus faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utility for the sender. Nevertheless, at the population level, where the model is known, it turns out that we can efficiently determine the optimal (resp.…
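To make the notion of a "persuasive" signaling policy concrete, the following is a minimal one-step sketch in Python. It is not the paper's algorithm: it only checks, for a single state, whether a direct recommendation scheme gives a myopic receiver no reason to deviate, and computes the sender's expected utility under that scheme. All names (`is_persuasive`, `sender_value`, the two-outcome example and its numbers) are hypothetical and chosen for illustration only.

```python
from fractions import Fraction as F

def is_persuasive(prior, scheme, receiver_utility):
    """Check the receiver's incentive constraints for a direct scheme.

    prior[w]               : probability of outcome w
    scheme[w][a]           : probability of recommending action a given outcome w
    receiver_utility[w][a] : receiver's utility for action a under outcome w
    (All three are illustrative inputs, not notation from the paper.)
    """
    n_outcomes = len(prior)
    n_actions = len(receiver_utility[0])
    for a in range(n_actions):          # recommended action
        for alt in range(n_actions):    # possible deviation
            # Unnormalized expected gain from following a instead of alt,
            # conditional on a being recommended.
            gain = sum(
                prior[w] * scheme[w][a]
                * (receiver_utility[w][a] - receiver_utility[w][alt])
                for w in range(n_outcomes)
            )
            if gain < 0:
                return False
    return True

def sender_value(prior, scheme, sender_utility):
    """Sender's expected one-step utility if the receiver follows recommendations."""
    return sum(
        prior[w] * scheme[w][a] * sender_utility[w][a]
        for w in range(len(prior))
        for a in range(len(sender_utility[0]))
    )

# Hypothetical example: two outcomes, two actions (0 = decline, 1 = accept).
prior = [F(7, 10), F(3, 10)]
receiver_utility = [[1, 0], [0, 1]]   # receiver wants to accept only under outcome 1
sender_utility = [[0, 1], [0, 1]]     # sender always prefers accept

always_accept = [[F(0), F(1)], [F(0), F(1)]]      # reveals nothing
full_reveal = [[F(1), F(0)], [F(0), F(1)]]        # reveals the outcome
partial = [[F(4, 7), F(3, 7)], [F(0), F(1)]]      # pools some outcome-0 cases into "accept"

for name, scheme in [("always_accept", always_accept),
                     ("full_reveal", full_reveal),
                     ("partial", partial)]:
    print(name,
          is_persuasive(prior, scheme, receiver_utility),
          sender_value(prior, scheme, sender_utility))
```

In this toy instance the uninformative scheme is not persuasive, while the partially revealing scheme is persuasive and gives the sender a higher expected utility than full revelation. The MPP setting in the abstract adds what this sketch omits: the sender must also account for how the induced actions affect future states over a finite horizon.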
