Social Welfare under Heterogeneous Time Preferences
Sarvin Bahmani, Soumyajit Paul, Sven Schewe, Shadi Tasdighi Kalat, Ashutosh Trivedi

TL;DR
This paper explores decision-making in multi-principal MDPs with diverse time preferences, proposing a framework for maximizing social welfare and analyzing the complexity of optimal strategies.
Contribution
It introduces heterogeneous time preferences into MDPs, characterizes optimal strategies as finite-memory, and examines the computational complexity of strategy synthesis.
Findings
Under heterogeneous time preferences, optimal strategies may need memory: positional (memoryless) strategies are not sufficient in general.
Optimal strategies can be implemented with polynomial memory and synthesized efficiently.
Deciding whether a positional strategy can achieve social welfare above a given threshold is NP-hard.
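The first finding can be illustrated with a small numeric sketch. The example below is not taken from the paper: the single-state MDP, the reward values, and the discount factors 0.5 and 0.9 are all hypothetical, chosen only to show how two principals with different discount factors can make a time-dependent strategy beat every positional one under a utilitarian (sum of discounted rewards) welfare. The infinite sums are truncated at an assumed horizon of 500 steps.

```python
# Hypothetical single-state MDP with two actions and two principals.
# Principal 1 earns 2 from action "a" and discounts with factor 0.5.
# Principal 2 earns 1 from action "b" and discounts with factor 0.9.
REWARDS = {"a": (2.0, 0.0), "b": (0.0, 1.0)}  # per-principal rewards (assumed)
DISCOUNTS = (0.5, 0.9)                        # per-principal discounts (assumed)


def step_welfare(action, t):
    """Sum over principals of the discounted reward of `action` at step t."""
    return sum(r * g ** t for r, g in zip(REWARDS[action], DISCOUNTS))


def welfare(strategy, horizon=500):
    """Truncated utilitarian welfare of a strategy mapping step -> action."""
    return sum(step_welfare(strategy(t), t) for t in range(horizon))


def greedy(t):
    """With a single state, choosing the best action at each step is optimal."""
    return max(REWARDS, key=lambda a: step_welfare(a, t))


always_a = welfare(lambda t: "a")   # positional: always play "a"
always_b = welfare(lambda t: "b")   # positional: always play "b"
switching = welfare(greedy)         # time-dependent: needs a step counter
first_actions = [greedy(t) for t in range(4)]

print(always_a, always_b, switching, first_actions)
```

In this toy instance the time-dependent strategy plays "a" for the first two steps and "b" afterwards, collecting roughly 11.1, while the two deterministic positional strategies collect roughly 4 and 10. Randomizing between the two actions at the single state gives a weighted average of 4 and 10, so it does not help either.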
Abstract
In several socioeconomic-critical decision-making settings, such as fair resource allocation, climate policy, or AI alignment, multiple principals interact within a common arena. While it is well established that these principals may have differing preferences, decision-making under heterogeneous time preferences remains relatively unexplored. In particular, principals may weigh future outcomes differently and may derive distinct utilities from the same decisions. Motivated by such scenarios, we introduce the notion of heterogeneous time preferences in MDPs, where multiple principals possess distinct reward functions and apply different discount factors to future rewards. To compute meaningful decisions in such settings, an AI agent must rely on a notion of optimality that accounts for the preferences of all principals. We adopt a utilitarian notion of social welfare, defined as the sum…
