Stochastic Network Utility Maximization with Unknown Utilities: Multi-Armed Bandits Approach
Arun Verma, Manjesh K. Hanawal

TL;DR
This paper addresses a stochastic network utility maximization problem with unknown agent utilities, using multi-armed bandit algorithms to optimize resource allocation under threshold utility functions.
Contribution
It introduces a novel bandit-based approach for network utility maximization with unknown utilities, providing algorithms and theoretical guarantees.
Findings
The proposed algorithms are optimal when all agents have identical utilities.
The algorithms achieve low regret relative to an oracle allocation policy that knows the agents' utilities.
Numerical experiments validate theoretical performance guarantees.
Abstract
In this paper, we study a novel Stochastic Network Utility Maximization (NUM) problem where the utilities of agents are unknown. The utility of each agent depends on the amount of resource it receives from a network operator/controller. The operator desires to do a resource allocation that maximizes the expected total utility of the network. We consider threshold type utility functions where each agent gets non-zero utility if the amount of resource it receives is higher than a certain threshold. Otherwise, its utility is zero (hard real-time). We pose this NUM setup with unknown utilities as a regret minimization problem. Our goal is to identify a policy that performs as 'good' as an oracle policy that knows the utilities of agents. We model this problem setting as a bandit setting where feedback obtained in each round depends on the resource allocated to the agents. We propose…
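To make the threshold utility model and the oracle benchmark concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm: the names `expected_total_utility`, `oracle_allocation`, and `regret`, the per-agent mean utilities, the total resource `budget`, and the choice to treat "higher than a threshold" as "at least the threshold" are all hypothetical. The oracle is a brute-force search that only works for a handful of agents.

```python
from itertools import combinations


def expected_total_utility(allocation, thresholds, mean_utilities):
    """Sum the mean utility of every agent whose allocation meets its threshold.

    Assumption: an agent earns its (hypothetical) mean utility when it receives
    at least its threshold, and zero otherwise.
    """
    return sum(
        mu
        for x, theta, mu in zip(allocation, thresholds, mean_utilities)
        if x >= theta
    )


def oracle_allocation(thresholds, mean_utilities, budget):
    """Brute-force oracle that knows thresholds and mean utilities.

    It tries every subset of agents, gives each chosen agent exactly its
    threshold, and keeps the feasible subset with the highest expected utility.
    """
    n = len(thresholds)
    best_alloc, best_value = [0.0] * n, 0.0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            cost = sum(thresholds[i] for i in subset)
            if cost > budget:
                continue  # this subset does not fit in the resource budget
            alloc = [thresholds[i] if i in subset else 0.0 for i in range(n)]
            value = expected_total_utility(alloc, thresholds, mean_utilities)
            if value > best_value:
                best_alloc, best_value = alloc, value
    return best_alloc, best_value


def regret(allocation, thresholds, mean_utilities, budget):
    """Expected utility lost in one round compared with the oracle."""
    _, best_value = oracle_allocation(thresholds, mean_utilities, budget)
    return best_value - expected_total_utility(
        allocation, thresholds, mean_utilities
    )


# Hypothetical three-agent instance with one unit of resource.
thresholds = [0.5, 0.3, 0.4]
mean_utilities = [0.9, 0.5, 0.6]
budget = 1.0

best_alloc, best_value = oracle_allocation(thresholds, mean_utilities, budget)
round_regret = regret([0.3, 0.3, 0.4], thresholds, mean_utilities, budget)
```

In this instance the oracle serves the first and third agents. The allocation `[0.3, 0.3, 0.4]` leaves the first agent below its threshold, so it collects nothing from that agent and incurs positive regret. A learner that does not know the thresholds or mean utilities has to discover them from the feedback it observes in each round.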
