Resource Allocation Among Agents with MDP-Induced Preferences
D. A. Dolgov, E. H. Durfee

TL;DR
This paper introduces an algorithm for allocating resources among agents whose preferences are induced by MDPs. It solves the allocation and policy-optimization problems together, so utilities over exponentially many resource bundles never have to be represented explicitly.
Contribution
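The paper presents an algorithm that simultaneously solves the resource-allocation and policy-optimization problems in MDP settings, avoiding an explicit representation of utilities over exponentially many resource bundles. It then uses this algorithm to design a combinatorial auction for allocating resources among self-interested agents.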
Findings
Solves large resource-allocation problems in minutes.
Yields drastic (often exponential) reductions in computational complexity compared with explicitly representing a utility for every resource bundle.
Demonstrates effectiveness on problems with up to 2^100 resource bundles.
Abstract
Allocating scarce resources among agents to maximize global utility is, in general, computationally challenging. We focus on problems where resources enable agents to execute actions in stochastic environments, modeled as Markov decision processes (MDPs), such that the value of a resource bundle is defined as the expected value of the optimal MDP policy realizable given these resources. We present an algorithm that simultaneously solves the resource-allocation and the policy-optimization problems. This allows us to avoid explicitly representing utilities over exponentially many resource bundles, leading to drastic (often exponential) reductions in computational complexity. We then use this algorithm in the context of self-interested agents to design a combinatorial auction for allocating resources. We empirically demonstrate the effectiveness of our approach by showing that it can, in…
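To make the utility definition in the abstract concrete, here is a minimal sketch, not the authors' algorithm, of how the value of a resource bundle can be computed as the value of the best MDP policy that uses only actions the bundle enables. The toy MDP, the resource names (`truck`, `plane`), the discount factor, and the function names are all assumptions made for illustration. The second function shows the naive alternative that the paper's approach is meant to avoid: tabulating a value for every one of the 2^n possible bundles.

```python
from itertools import combinations

# Hypothetical 3-state MDP (illustrative only, not taken from the paper).
N_STATES = 3

# Resources each action needs before an agent may execute it.
REQUIRES = {
    "wait": frozenset(),            # always available
    "drive": frozenset({"truck"}),
    "fly": frozenset({"plane"}),
}

# TRANS[action][state] -> list of (probability, next_state, reward).
TRANS = {
    "wait": {s: [(1.0, s, 0.0)] for s in range(N_STATES)},
    "drive": {
        0: [(0.8, 1, 1.0), (0.2, 0, 0.0)],
        1: [(1.0, 2, 2.0)],
        2: [(1.0, 2, 0.0)],
    },
    "fly": {
        0: [(1.0, 2, 5.0)],
        1: [(1.0, 2, 1.0)],
        2: [(1.0, 2, 0.0)],
    },
}


def bundle_value(bundle, start=0, gamma=0.9, iters=200):
    """Expected discounted value of the best policy realizable with `bundle`.

    Runs plain value iteration, restricted to actions whose resource
    requirements are a subset of the bundle.
    """
    bundle = frozenset(bundle)
    allowed = [a for a, req in REQUIRES.items() if req <= bundle]
    values = [0.0] * N_STATES
    for _ in range(iters):
        values = [
            max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in TRANS[a][s])
                for a in allowed
            )
            for s in range(N_STATES)
        ]
    return values[start]


def all_bundle_values(resources):
    """Naive utility table: one MDP solve per bundle, 2**n entries in total."""
    resources = sorted(resources)
    return {
        frozenset(c): bundle_value(c)
        for k in range(len(resources) + 1)
        for c in combinations(resources, k)
    }


if __name__ == "__main__":
    for bundle, value in all_bundle_values({"truck", "plane"}).items():
        print(sorted(bundle), round(value, 4))
```

With two resources the table has only four entries, but it doubles with every added resource, which is why an approach that optimizes the allocation and the policies together, without building this table, matters at scale.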
