Learning Robust Scheduling with Search and Attention
David Sandberg, Tor Kvernvik, Francesco Davide Calabrese

TL;DR
This paper introduces an MU-MIMO scheduling method that combines Monte Carlo Tree Search and reinforcement learning with a self-attention neural network. In simulations it outperforms state-of-the-art heuristics, including when measurements are uncertain.
Contribution
The paper formulates MU-MIMO scheduling as a tree-structured combinatorial search problem, borrowing from AlphaGo Zero, and integrates self-attention into the neural network architecture.
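To make the tree-search framing concrete, here is a minimal sketch, not the authors' implementation. It assumes a single resource block, a toy sum-rate reward, and plain UCT with random rollouts in place of the paper's network-guided search. The names GAIN, CORR, MAX_LAYERS, sum_rate, legal_actions and mcts are hypothetical and chosen only for illustration. Each tree level decides whether one user is co-scheduled (1) or skipped (0).

```python
import math
import random

# Toy inputs, assumed for illustration only (not taken from the paper).
GAIN = [8.0, 6.0, 5.0, 3.0]            # per-user channel gain
CORR = [[0.0, 0.9, 0.1, 0.2],
        [0.9, 0.0, 0.2, 0.1],
        [0.1, 0.2, 0.0, 0.8],
        [0.2, 0.1, 0.8, 0.0]]           # pairwise interference coupling
MAX_LAYERS = 2                           # max users sharing the resource


def sum_rate(decisions):
    """Toy reward: sum of log2(1 + SINR) over the co-scheduled users."""
    users = [u for u, d in enumerate(decisions) if d]
    total = 0.0
    for u in users:
        interference = sum(CORR[u][v] * GAIN[v] for v in users if v != u)
        total += math.log2(1.0 + GAIN[u] / (1.0 + interference))
    return total


def legal_actions(state):
    """Actions for the next user: 0 = skip, 1 = schedule (if layers remain)."""
    if len(state) == len(GAIN):
        return []                        # every user decided: terminal state
    return [0, 1] if sum(state) < MAX_LAYERS else [0]


class Node:
    def __init__(self, state):
        self.state = state               # tuple of decisions made so far
        self.children = {}               # action -> Node
        self.visits = 0
        self.value = 0.0                 # sum of rewards backed up through here


def rollout(state, rng):
    """Complete the schedule with uniformly random legal decisions."""
    state = list(state)
    while legal_actions(state):
        state.append(rng.choice(legal_actions(state)))
    return sum_rate(state)


def mcts(n_sim=2000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node(())
    for _ in range(n_sim):
        node, path = root, [root]
        while legal_actions(node.state):
            untried = [a for a in legal_actions(node.state)
                       if a not in node.children]
            if untried:                  # expansion: add one new child
                a = rng.choice(untried)
                node.children[a] = Node(node.state + (a,))
                node = node.children[a]
                path.append(node)
                break
            parent = node                # selection: UCB1 over the children
            node = max(parent.children.values(),
                       key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(parent.visits) / ch.visits))
            path.append(node)
        reward = rollout(node.state, rng)
        for n in path:                   # backpropagation
            n.visits += 1
            n.value += reward
    node = root                          # read out the most-visited path
    while node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
    # Pad with "skip" in case the most-visited path stops before a leaf.
    return node.state + (0,) * (len(GAIN) - len(node.state))
```

Calling mcts() returns a tuple of 0/1 decisions, one per user, that respects MAX_LAYERS. The rewards are not normalized to [0, 1], so the exploration constant c would need tuning for any other reward scale.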
Findings
Outperforms state-of-the-art heuristics in simulations
Handles measurement uncertainties effectively
Demonstrates scalability to complex scheduling scenarios
Abstract
Allocating physical layer resources to users based on channel quality, buffer size, requirements and constraints represents one of the central optimization problems in the management of radio resources. The solution space grows combinatorially with the cardinality of each dimension, making it hard to find optimal solutions using an exhaustive search or even classical optimization algorithms given the stringent time requirements. This problem is even more pronounced in MU-MIMO scheduling, where the scheduler can assign multiple users to the same time-frequency physical resources. Traditional approaches thus resort to designing heuristics that trade optimality in favor of feasibility of execution. In this work, we treat the MU-MIMO scheduling problem as a tree-structured combinatorial problem and, borrowing from the recent successes of AlphaGo Zero, we investigate the feasibility of…
Taxonomy
Topics: Advanced Wireless Network Optimization · Advanced Wireless Communication Techniques · Advanced MIMO Systems Optimization
