Decentralized Online Learning in General-Sum Stackelberg Games
Yaolong Yu, Haipeng Chen

TL;DR
This paper investigates decentralized online learning in general-sum Stackelberg games, revealing how different information settings influence follower strategies and convergence, with new algorithms and empirical validation.
Contribution
It introduces novel strategies for followers in different information settings and provides convergence and sample complexity results for decentralized learning in Stackelberg games.
Findings
Myopic best response is the follower's best strategy in the limited information setting.
With side information about the leader's reward, the follower can manipulate the leader's reward signals.
The new manipulation strategy outperforms best response in the side information setting.
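The gap between best responding and manipulating can be illustrated on a small matrix game. The sketch below is not the paper's algorithm: the 2x2 payoffs are hypothetical, the helper names are invented for illustration, and the leader is assumed to know the follower's response map exactly rather than learn it from noisy feedback. It brute-forces every follower response map and compares the outcome with plain best response.

```python
from itertools import product

# Hypothetical toy payoffs (not from the paper).
# Rows index the leader's action, columns index the follower's action.
LEADER_REWARD = [[0.8, 0.2], [0.5, 0.3]]
FOLLOWER_REWARD = [[0.3, 0.1], [0.9, 0.6]]


def argmax(values):
    """Index of the largest value; ties go to the lowest index."""
    return max(range(len(values)), key=lambda i: values[i])


def leader_choice(response):
    """Leader action maximizing its own reward, given the follower's response map.

    Simplifying assumption: the leader sees the response map exactly.
    """
    induced = [LEADER_REWARD[a][response[a]] for a in range(len(LEADER_REWARD))]
    return argmax(induced)


def outcome(response):
    """Return (leader action, follower action, leader reward, follower reward)."""
    a = leader_choice(response)
    b = response[a]
    return a, b, LEADER_REWARD[a][b], FOLLOWER_REWARD[a][b]


def best_response_map():
    """Myopic follower: maximize its own reward for each leader action."""
    return tuple(argmax(row) for row in FOLLOWER_REWARD)


def best_manipulation_map():
    """Brute-force search over all response maps for the follower's best outcome.

    This needs the leader's reward table, i.e. the side information.
    """
    n_leader = len(LEADER_REWARD)
    n_follower = len(LEADER_REWARD[0])
    candidates = product(range(n_follower), repeat=n_leader)
    return max(candidates, key=lambda response: outcome(response)[3])


if __name__ == "__main__":
    print("best response:", outcome(best_response_map()))
    print("manipulation: ", outcome(best_manipulation_map()))
```

In this toy game, best responding leads the leader to action 0 and gives the follower 0.3. The manipulating follower answers action 0 with its worse reply, so the leader prefers action 1 and the follower receives 0.9. The deviation is off the leader's chosen path in this static model; in an online setting it would cost the follower reward while the leader is still exploring.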
Abstract
We study an online learning problem in general-sum Stackelberg games, where players act in a decentralized and strategic manner. We study two settings depending on the type of information for the follower: (1) the limited information setting where the follower only observes its own reward, and (2) the side information setting where the follower has extra side information about the leader's reward. We show that for the follower, myopically best responding to the leader's action is the best strategy for the limited information setting, but not necessarily so for the side information setting -- the follower can manipulate the leader's reward signals with strategic actions, and hence induce the leader's strategy to converge to an equilibrium that is better off for itself. Based on these insights, we study decentralized online learning for both players in the two settings. Our main…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Distributed Sensor Networks and Detection Algorithms
