Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs
Zijian An, Lifeng Zhou

TL;DR
This paper applies reinforcement learning techniques, specifically DQN and PPO, to solve a complex multi-step game-theoretic resource allocation problem on graphs, demonstrating superior performance over baseline strategies.
Contribution
It formulates the game as an MDP, introduces a dynamic action set generation method, and evaluates RL algorithms on various graph structures, advancing game-theoretic resource allocation methods.
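As a rough illustration of what treating the game as an MDP can look like, the sketch below models one transition. It is not the paper's formulation: the state layout (per-node resource counts for each player), the scoring rule (a node goes to the player with strictly more resources on it), the sign-based reward, and the names `nodes_won` and `step` are all assumptions made for this example. Graph move constraints are not checked here.

```python
import numpy as np

def nodes_won(own, opp):
    """Count nodes where `own` holds strictly more resources than `opp` (assumed rule)."""
    return int(np.sum(np.asarray(own) > np.asarray(opp)))

def step(state, action_a, action_b):
    """One hypothetical MDP transition.

    state    : (alloc_a, alloc_b), resource counts per node for each player
    action_* : the new per-node allocation chosen by each player
    Returns (next_state, reward), with the reward seen from player A's side.
    """
    alloc_a, alloc_b = (np.asarray(x) for x in state)
    new_a, new_b = np.asarray(action_a), np.asarray(action_b)
    # Resources are moved, never created or destroyed.
    if new_a.sum() != alloc_a.sum() or new_b.sum() != alloc_b.sum():
        raise ValueError("an action must conserve each player's total resources")
    # Assumed reward: +1 if A wins more nodes, -1 if B does, 0 on a tie.
    reward = float(np.sign(nodes_won(new_a, new_b) - nodes_won(new_b, new_a)))
    return (new_a, new_b), reward

# Three nodes, three resource units per player (made-up numbers).
state = ([1, 1, 1], [1, 1, 1])
next_state, reward = step(state, [2, 1, 0], [1, 0, 2])
```

In this made-up round, player A wins nodes 0 and 1 and player B wins node 2, so the reward for A is positive.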
Findings
RL agents outperform baseline strategies.
RL agents achieve a balanced 50% win rate when playing against each other.
RL adapts to structural advantages on asymmetric graphs.
Abstract
Game-theoretic resource allocation on graphs (GRAG) involves two players competing over multiple steps to control nodes of interest on a graph, a problem modeled as a multi-step Colonel Blotto Game (MCBG). Finding optimal strategies is challenging due to the dynamic action space and structural constraints imposed by the graph. To address this, we formulate the MCBG as a Markov Decision Process (MDP) and apply Reinforcement Learning (RL) methods, specifically Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). To enforce graph constraints, we introduce an action-displacement adjacency matrix that dynamically generates valid action sets at each step. We evaluate RL performance across a variety of graph structures and initial resource distributions, comparing against random, greedy, and learned RL policies. Experimental results show that both DQN and PPO consistently outperform…
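To make the idea of generating valid action sets from graph structure concrete, here is a minimal sketch. It uses a plain node adjacency matrix and is not the paper's action-displacement adjacency matrix, whose exact construction is not given in this summary. The helper names `valid_moves` and `valid_actions`, the rule that a resource may stay in place, and the 4-node path graph are assumptions for illustration only.

```python
import itertools
import numpy as np

def valid_moves(adj, node):
    """Nodes reachable in one step from `node`: its neighbors plus the node itself."""
    reachable = adj[node].astype(bool).copy()
    reachable[node] = True  # assumption: a resource unit may stay where it is
    return np.flatnonzero(reachable).tolist()

def valid_actions(adj, positions):
    """Enumerate joint actions: one destination node per resource unit."""
    per_unit = [valid_moves(adj, p) for p in positions]
    return [tuple(a) for a in itertools.product(*per_unit)]

# Hypothetical path graph 0 - 1 - 2 - 3.
ADJ = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

positions = [0, 2]  # current node of each resource unit
actions = valid_actions(ADJ, positions)
```

Because the action set is recomputed from the current positions at every step, its size changes as resources move, which is the sense in which the action space is dynamic. Enumerating the full product grows quickly with the number of resource units, so this sketch only suits small instances.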
Taxonomy
Topics: Advanced Graph Neural Networks · Reinforcement Learning in Robotics · Game Theory and Applications
Methods: Convolution · Q-Learning · Dense Connections · Deep Q-Network · Entropy Regularization · Proximal Policy Optimization
