Multi-Agent Q-Learning for Real-Time Load Balancing User Association and Handover in Mobile Networks
Alireza Alizadeh, Byungju Lim, Mai Vu

TL;DR
This paper introduces multi-agent online Q-learning algorithms for real-time load balancing, user association, and handover management in dense cellular networks, improving stability and performance.
Contribution
It proposes novel centralized and distributed multi-agent Q-learning policies that adapt to network dynamics and ensure load balancing during user association and handover.
Findings
Outperforms 3GPP max-SINR association in simulations.
Demonstrates robustness across various user mobility profiles.
Achieves low complexity and fast convergence in dynamic environments.
Abstract
As next generation cellular networks become denser, associating users with the optimal base stations at each time while ensuring no base station is overloaded becomes critical for achieving stable and high network performance. We propose multi-agent online Q-learning (QL) algorithms for performing real-time load balancing user association and handover in dense cellular networks. The load balancing constraints at all base stations couple the actions of user agents, and we propose two multi-agent action selection policies, one centralized and one distributed, to satisfy load balancing at every learning step. In the centralized policy, the actions of UEs are determined by a central load balancer (CLB) running an algorithm based on swapping the worst connection to maximize the total learning reward. In the distributed policy, each UE takes an action based on its local information by…
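To make the idea of load-constrained multi-agent Q-learning concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a stateless, per-UE Q-table indexed by base station, a fixed per-BS quota, and a simple greedy sequential assignment in place of the paper's swapping-based central load balancer. The names `select_actions`, `q_update`, `quota`, and the synthetic `rate` matrix are all hypothetical and introduced only for illustration.

```python
import numpy as np


def select_actions(q, quota, epsilon, rng):
    """Assign every UE to one BS without exceeding any BS quota.

    Illustrative greedy rule (an assumption, not the paper's policy):
    UEs are visited in random order and each one picks, epsilon-greedily,
    among the base stations that still have free capacity.
    """
    n_ue, n_bs = q.shape
    if quota.sum() < n_ue:
        raise ValueError("total quota is smaller than the number of UEs")
    load = np.zeros(n_bs, dtype=int)
    assoc = np.full(n_ue, -1, dtype=int)
    for ue in rng.permutation(n_ue):
        free = np.flatnonzero(load < quota)      # BSs with spare capacity
        if rng.random() < epsilon:
            bs = rng.choice(free)                # explore
        else:
            bs = free[np.argmax(q[ue, free])]    # exploit best feasible BS
        assoc[ue] = bs
        load[bs] += 1
    return assoc


def q_update(q, ue, bs, reward, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place.

    With a single (implicit) state per UE, the bootstrap term is the
    maximum of that UE's own Q-values.
    """
    q[ue, bs] += alpha * (reward + gamma * q[ue].max() - q[ue, bs])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_ue, n_bs = 8, 3
    quota = np.array([3, 3, 3])                  # hypothetical per-BS limits
    rate = rng.random((n_ue, n_bs))              # synthetic stand-in reward
    q = np.zeros((n_ue, n_bs))
    for step in range(500):
        assoc = select_actions(q, quota, epsilon=0.1, rng=rng)
        for ue, bs in enumerate(assoc):
            q_update(q, ue, bs, rate[ue, bs])
    print("final association:", assoc)
    print("BS loads:", np.bincount(assoc, minlength=n_bs))
```

Because the quota check happens inside action selection, every learning step yields a feasible association, which mirrors the abstract's requirement that load balancing be satisfied at every step. The reward here is a fixed random matrix; a real study would substitute a measured or simulated per-link rate.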
Taxonomy
Topics: Wireless Networks and Protocols · Advanced Wireless Network Optimization · Wireless Communication Networks Research
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Q-Learning · Balanced Selection
