Learning in Repeated Multi-Objective Stackelberg Games with Payoff Manipulation
Phurinut Srisawad, Juergen Branke, Long Tran-Thanh

TL;DR
This paper investigates how a leader can strategically manipulate payoffs in repeated multi-objective Stackelberg games to influence follower responses, balancing preference elicitation and utility maximization.
Contribution
It introduces novel manipulation policies based on expected utility and long-term expected utility, with theoretical convergence guarantees and empirical validation.
Findings
longEU converges to the optimal manipulation under infinitely repeated interactions.
Proposed policies improve leader utility and promote beneficial outcomes.
The approach requires neither explicit negotiation nor prior knowledge of the follower's utility function.
Abstract
We study payoff manipulation in repeated multi-objective Stackelberg games, where a leader may strategically influence a follower's deterministic best response, e.g., by offering a share of their own payoff. We assume that the follower's utility function, representing preferences over multiple objectives, is unknown but linear, and its weight parameter must be inferred through interaction. This introduces a sequential decision-making challenge for the leader, who must balance preference elicitation with immediate utility maximisation. We formalise this problem and propose manipulation policies based on expected utility (EU) and long-term expected utility (longEU), which guide the leader in selecting actions and offering incentives that trade off short-term gains with long-term impact. We prove that under infinite repeated interactions, longEU converges to the optimal manipulation.…
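The trade-off between preference elicitation and immediate utility described in the abstract can be illustrated with a small sketch. This is not the authors' EU or longEU policy. It is a toy under stated assumptions: one leader action, two follower responses, two objectives, a finite set of candidate follower weights with a uniform belief, and a leader cost equal to the sum of the bonus offered. All names (`FOLLOWER_PAYOFFS`, `best_response`, `expected_utility`, `prune`) are hypothetical.

```python
import numpy as np

# Hypothetical toy game (assumption, not from the paper):
# rows are follower responses, columns are the follower's two objectives.
FOLLOWER_PAYOFFS = np.array([[1.0, 0.0],   # response 0
                             [0.0, 1.0]])  # response 1
LEADER_PAYOFFS = np.array([0.0, 3.0])      # leader prefers response 1
TARGET = 1                                 # response the leader tries to induce


def best_response(w, bonus):
    """Deterministic best response of a follower with linear utility weights w.

    The bonus vector is added to the follower's payoff for the target response.
    """
    payoffs = FOLLOWER_PAYOFFS.copy()
    payoffs[TARGET] += bonus
    return int(np.argmax(payoffs @ w))  # np.argmax breaks ties by lowest index


def expected_utility(candidates, bonus):
    """Leader's myopic expected utility under a uniform belief over candidates.

    Assumption: the leader pays bonus.sum() only if the target is chosen.
    """
    total = 0.0
    for w in candidates:
        response = best_response(w, bonus)
        paid = bonus.sum() if response == TARGET else 0.0
        total += LEADER_PAYOFFS[response] - paid
    return total / len(candidates)


def prune(candidates, bonus, observed_response):
    """Keep only the weight vectors consistent with the observed response."""
    return [w for w in candidates if best_response(w, bonus) == observed_response]


# Candidate follower weights (a, 1 - a); the true weight is unknown to the leader.
candidates = [np.array([a, 1.0 - a]) for a in (0.2, 0.4, 0.6, 0.8)]
bonuses = [np.array([b, 0.0]) for b in (0.0, 0.5, 1.0)]

for bonus in bonuses:
    print(bonus, expected_utility(candidates, bonus))

# Probing with the middle bonus and observing the target response
# rules out candidates that would have refused it.
remaining = prune(candidates, bonuses[1], observed_response=TARGET)
print(len(remaining), expected_utility(remaining, bonuses[1]))
```

In this toy, the largest bonus has the highest myopic expected utility, but every candidate accepts it, so observing the response teaches the leader nothing. Offering the smaller bonus once is riskier in the short term, yet the observed response shrinks the candidate set and can reveal that a cheaper incentive suffices. This is the kind of short-term versus long-term tension the abstract attributes to the leader's decision problem.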
